1053-1095; Harrison 1991 Nature (London) 353, 715-719; and Klug 1993 Gene 135,
83-92). Sequence specificity results from the geometrical and chemical complementarity
between the amino acid side chains of the α-helix and the accessible groups exposed on
the edges of base-pairs. In addition to this direct reading of the DNA sequence,
interactions with the DNA backbone stabilise the complex and are sensitive to the
conformation of the nucleic acid, which in turn depends on the base sequence (Dickerson
& Drew 1981 J. Mol. Biol. 149, 761-786). A priori, a simple set of rules might suffice
to explain the specific association of protein and DNA in all complexes, based on the
possibility that certain amino acid side chains have preferences for particular base-pairs.
However, crystal structures of protein-DNA complexes have shown that proteins can be
idiosyncratic in their mode of DNA recognition, at least partly because they may use
alternative geometries to present their sensory α-helices to DNA, allowing a variety of
different base contacts to be made by a single amino acid and vice versa (Matthews 1988
Nature (London) 335, 294-295).

Mutagenesis of Zf proteins has confirmed the modularity of the domains. Site-directed
mutagenesis has been used to change key Zf residues, identified through sequence
homology alignment and from the structural data, resulting in altered specificity of the Zf
domain (Nardelli et al., 1992 NAR 20, 4137-4144). The authors suggested that although
design of novel binding specificities would be desirable, design would need to take into
account sequence and structural data. They state "there is no prospect of achieving a zinc
finger recognition code".

Despite this, many groups have been trying to work towards such a code, although only
limited rules have so far been proposed. For example, Desjarlais et al. (1992b PNAS
89, 7345-7349) used systematic mutation of two of the three contact residues (based on
consensus sequences) in finger two of the polypeptide Sp1 to suggest that a limited
degenerate code might exist. Subsequently the authors used this to design three Zf
proteins with different binding specificities and affinities (Desjarlais & Berg, 1993 PNAS
90, 2250-2260). They state that the design of Zf proteins with predictable specificities and
affinities "may not always be straightforward".
We believe the zinc finger of the TFIIIA class to be a good candidate for deriving a set
of more generally applicable specificity rules owing to its great simplicity of structure and
interaction with DNA. The zinc finger is an independently folding domain which uses a
zinc ion to stabilise the packing of an antiparallel β-sheet against an α-helix (Miller et al.,
1985 EMBO J. 4, 1609-1614; Berg 1988 Proc. Natl. Acad. Sci. USA 85, 99-102; and Lee
et al., 1989 Science 245, 635-637). The crystal structures of zinc finger-DNA complexes
show a semiconserved pattern of interactions in which 3 amino acids from the α-helix
contact 3 adjacent bases (a triplet) in DNA (Pavletich & Pabo 1991 Science 252, 809-817;
Fairall et al., 1993 Nature (London) 366, 483-487; and Pavletich & Pabo 1993 Science
261, 1701-1707). Thus the mode of DNA recognition is principally a one-to-one
interaction between amino acids and bases. Because zinc fingers function as independent
modules (Miller et al., 1985 EMBO J. 4, 1609-1614; Klug & Rhodes 1987 Trends
Biochem. Sci. 12, 464-469), it should be possible for fingers with different triplet
specificities to be combined to give specific recognition of longer DNA sequences. Each
finger is folded so that three amino acids are presented for binding to the DNA target
sequence, although binding may be directly through only two of these positions. In the
case of Zif268 for example, the protein is made up of three fingers which contact a 9 base
pair contiguous sequence of target DNA. A linker sequence is found between fingers
which appears to make no direct contact with the nucleic acid.
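
By way of illustration only, the modular reading described above can be sketched as follows, on the simplifying assumption that each finger is assigned one consecutive, non-overlapping triplet of the target site. The function name `split_into_triplets` and the example sequence are hypothetical and are not taken from this specification; the orientation in which the fingers lie along the site is deliberately not modelled.

```python
def split_into_triplets(site: str) -> list[str]:
    """Split a DNA site into consecutive, non-overlapping triplets.

    Assumption for illustration: one finger is assigned to each triplet.
    """
    site = site.upper()
    if set(site) - set("ACGT"):
        raise ValueError("site must contain only A, C, G or T")
    if len(site) % 3 != 0:
        raise ValueError("site length must be a multiple of three")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


# Arbitrary 9-base example (not a sequence from the specification):
# a three-finger protein would be assigned one triplet per finger.
triplets = split_into_triplets("ACGTTGCAA")
```

A site of 9 bases yields three triplets, matching the three-finger arrangement described for Zif268.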

Protein engineering experiments have shown that it is possible to alter rationally the
DNA-binding characteristics of individual zinc fingers when one or more of the α-helical
positions is varied in a number of proteins (Nardelli et al., 1991 Nature (London) 349,
175-178; Nardelli et al., 1992 Nucleic Acids Res. 20, 4137-4144; and Desjarlais & Berg
1992a Proteins 13, 272). It has already been possible to propose some principles relating
amino acids on the α-helix to corresponding bases in the bound DNA sequence (Desjarlais
& Berg 1992b Proc. Natl. Acad. Sci. USA 89, 7345-7349). However, this approach has two
drawbacks: firstly, the altered positions on the α-helix are prejudged, making it possible
to overlook the role of positions which are not currently considered important; and
secondly, owing to the importance of context, concomitant alterations are sometimes
required to affect specificity (Desjarlais & Berg 1992b), so that a significant correlation
between an amino acid and base may be misconstrued.
To investigate binding of mutant Zf proteins, Thiesen and Bach (1991 FEBS 283, 23-26)
mutated Zf fingers and studied their binding to randomised oligonucleotides, using
electrophoretic mobility shift assays. Subsequent use of phage display technology has
permitted the expression of random libraries of Zf mutant proteins on the surface of
bacteriophage. The three Zf domains of Zif268, with 4 positions within finger one
randomised, have been displayed on the surface of filamentous phage by Rebar and Pabo
(1994 Science 263, 671-673). The library was then subjected to rounds of affinity
selection by binding to target DNA oligonucleotide sequences in order to obtain Zf
proteins with new binding specificities. Randomised mutagenesis (at the same positions
as those selected by Rebar & Pabo) of finger 1 of Zif268 with phage display has also
been used by Jamieson et al. (1994 Biochemistry 33, 5689-5695) to create novel binding
specificity and affinity.

More recently Wu et al. (1995 Proc. Natl. Acad. Sci. USA 92, 344-348) have made three
libraries, each of a different finger from Zif268, and each having six or seven a-helical 
positions randomised. Six triplets were used in selections but did not return fingers with 
any sequence biases; and when the three triplets of the Zif268 binding site were 
individually used as controls, the vast majority of selected fingers did not resemble the 
sequences of the wild-type Zif268 fingers and, though capable of tight binding to their 
target sites in vitro, were usually not able to discriminate strongly against different triplets. 
The authors interpret the results as evidence against the existence of a code. 

In summary, it is known that Zf protein motifs are widespread in DNA binding proteins 
and that binding is via three key amino acids, each one contacting a single base pair in the 
target DNA sequence. Motifs are modular and may be linked together to form a set of 
fingers which recognise a contiguous DNA sequence (e.g. a three fingered protein will 
recognise a 9mer etc). The key residues involved in DNA binding have been identified 
through sequence data and from structural information. Directed and random mutagenesis 
has confirmed the role of these amino acids in determining specificity and affinity. Phage 
display has been used to screen for new binding specificities of random mutants of fingers. 
A recognition code, to aid design of new finger specificities, has been worked towards 
although it has been suggested that specificity may be difficult to predict. 
beads; or affinity chromatography column. Conveniently the sequences are biotinylated.
Preferably the sequences are contained within 12 mini-libraries, as explained elsewhere. 
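
By way of illustration only, the arrangement of the 64 possible triplets into 12 mini-libraries (each having one position of the triplet fixed and the other two randomised, as set out later in connection with the kit) can be enumerated as in the sketch below. The function name `mini_libraries` and the dictionary layout are assumptions made for this example and do not form part of the specification.

```python
from itertools import product

BASES = "ACGT"


def mini_libraries() -> dict[tuple[int, str], list[str]]:
    """Return 12 mini-libraries keyed by (fixed position, fixed base).

    Each mini-library lists the 16 triplets that share the fixed base at
    the fixed position, the other two positions taking every base.
    """
    libraries = {}
    for position in range(3):
        for fixed in BASES:
            libraries[(position, fixed)] = [
                "".join(t)
                for t in product(BASES, repeat=3)
                if t[position] == fixed
            ]
    return libraries


libs = mini_libraries()
all_triplets = {t for members in libs.values() for t in members}
```

On this layout every one of the 64 triplets is a member of exactly three mini-libraries, one for each position of the triplet.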

In a further aspect the invention provides a zinc finger polypeptide designed by one or 
both of the methods defined above. Preferably the zinc finger polypeptide designed by 
the method comprises a combination of a plurality of zinc fingers (adjacent zinc fingers 
being joined by an intervening linker peptide), each finger comprising a zinc finger 
binding motif. Desirably, each zinc finger binding motif in the zinc finger polypeptide 
has been selected for preferable binding characteristics by the method defined above. The 
intervening linker peptide may be the same between each adjacent zinc finger or, 
alternatively, the same zinc finger polypeptide may contain a number of different linker 
peptides. The intervening linker peptide may be one that is present in naturally-occurring 
zinc finger polypeptides or may be an artificial sequence. In particular, the sequence of 
the intervening linker peptide may be varied, for example, to optimise binding of the zinc 
finger polypeptide to the target sequence. 

Where the zinc finger polypeptide comprises a plurality of zinc binding motifs, it is 
preferred that each motif binds to those DNA triplets which represent contiguous or 
substantially contiguous DNA in the sequence of interest. Where several candidate 
binding motifs or candidate combinations of motifs exist, these may be screened against
the actual target sequence to determine the optimum composition of the polypeptide.
Competitor DNA may be included in the screening assay for comparison, as described 
below. 

The non-specific component of all protein-DNA interactions, which includes contacts to
the sugar-phosphate backbone as well as ambiguous contacts to base-pairs, is a
considerable driving force towards complex formation and can result in the selection of
DNA-binding proteins with reasonable affinity but without specificity for a given DNA
sequence. Therefore, in order to minimise these non-specific interactions when designing
a polypeptide, selections should preferably be performed with low concentrations of
specific binding site in a background of competitor DNA, and binding should desirably
take place in solution to avoid local concentration effects and the avidity of multivalent
phage for ligands immobilised on solid surfaces.

As a safeguard against spurious selections, the specificity of individual phage should be 
determined following the final round of selection. Instead of testing for binding to a small 
number of binding sites, it would be desirable to screen all possible DNA sequences.

It has now been shown possible by the present inventors (below) to design a truly modular
zinc binding polypeptide, wherein the zinc binding motif of each zinc binding finger is
selected on the basis of its affinity for a particular triplet. Accordingly, it should be well
within the capability of one of normal skill in the art to design a zinc finger polypeptide
capable of binding to any desired target DNA sequence simply by considering the
sequence of triplets present in the target DNA and combining in the appropriate order zinc
fingers comprising zinc finger binding motifs having the necessary binding characteristics
to bind thereto. The greater the length of known sequence of the target DNA, the greater
the number of zinc finger binding motifs that can be included in the zinc finger
polypeptide. For example, if the known sequence is only 9 bases long then three zinc
finger binding motifs can be included in the polypeptide. If the known sequence is 27
bases long then, in theory, up to nine binding motifs could be included in the polypeptide.
The longer the target DNA sequence, the lower the probability of its occurrence in any
given portion of DNA.
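
The arithmetic behind these statements can be sketched as below. The sketch assumes, purely for illustration, that the DNA is a random sequence in which the four bases are equally likely and independent at every position, and it counts one strand only; real genomes depart from this model. The function names are hypothetical.

```python
def max_fingers(known_length: int) -> int:
    """Number of whole triplets (one per finger) in a known sequence."""
    return known_length // 3


def expected_occurrences(site_length: int, dna_length: int) -> float:
    """Expected number of copies of one specific site on one strand.

    Assumes random DNA with equiprobable, independent bases, so that a
    given site of n bases matches any one window with probability 0.25**n.
    """
    windows = max(dna_length - site_length + 1, 0)
    return windows * 0.25 ** site_length


# 9 known bases allow three fingers; 27 known bases allow up to nine.
three = max_fingers(9)
nine = max_fingers(27)

# In an arbitrary stretch of 10**9 bases, a 27-base site is expected far
# less often than a 9-base site.
short_site = expected_occurrences(9, 10**9)
long_site = expected_occurrences(27, 10**9)
```

Under this model each additional finger lowers the expected number of chance occurrences by a factor of 64.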

Moreover, those motifs selected for inclusion in the polypeptide could be artificially
modified (e.g. by directed mutagenesis) in order to optimise further their binding
characteristics. Alternatively (or additionally) the length and amino acid sequence of the
linker peptide joining adjacent zinc binding fingers could be varied, as outlined above.
This may have the effect of altering the position of the zinc finger binding motif relative
to the DNA sequence of interest, and thereby exert a further influence on binding
characteristics.



Generally, it will be preferred to select those motifs having high affinity and high 
specificity for the target triplet. 
In a further aspect, the invention provides a kit for making a zinc finger polypeptide for
binding to a nucleic acid sequence of interest, comprising: a library of DNA sequences 
encoding zinc finger binding motifs of known binding characteristics in a form suitable for 
cloning into a vector; a vector molecule suitable for accepting one or more sequences from 
the library; and instructions for use. 

Preferably the vector is capable of directing the expression of the cloned sequences as a 
single zinc finger polypeptide. In particular it is preferred that the vector is capable of 
directing the expression of the cloned sequences as a single zinc finger polypeptide 
displayed on the surface of a viral particle, typically of the sort of viral display particle
which is known to those skilled in the art. The DNA sequences are preferably in such
a form that the expressed polypeptides are capable of self-assembling into a number of 
zinc finger polypeptides. 

It will be apparent that the kit defined above will be of particular use in designing a zinc
finger polypeptide comprising a plurality of zinc finger binding motifs, the binding 
characteristics of which are already known. In another aspect the invention provides a kit 
for use when zinc finger binding motifs with suitable binding characteristics have not yet 
been identified, such that the invention provides a kit for making a zinc finger polypeptide 
for binding to a nucleic acid sequence of interest, comprising: a library of DNA 
sequences, each encoding a zinc finger binding motif in a form suitable for screening 
and/or selecting according to the methods defined above; and instructions for use. 

Advantageously, the library of DNA sequences in the kit will be a library in accordance 
with the first aspect of the invention. Conveniently, the kit may also comprise a library 
of 64 DNA sequences, each sequence comprising a different one of the 64 possible 
permutations of three DNA bases, in a form suitable for use in the selection method 
defined previously. Typically, the 64 sequences are present in 12 separate mini-libraries, 
each mini-library having one position in the relevant triplet fixed and two positions
randomised. Preferably, the kit will also comprise appropriate buffer solutions, and/or 
reagents for use in the detection of bound zinc fingers. The kit may also usefully include 
a vector suitable for accepting one or more sequences selected from the library of DNA 
In a further aspect the invention provides a method of altering the expression of a gene
of interest in a target cell, comprising: determining (if necessary) at least part of the DNA
sequence of the structural region and/or a regulatory region of the gene of interest;
designing a zinc finger polypeptide to bind to the DNA of known sequence; and causing
said zinc finger polypeptide to be present in the target cell (preferably in the nucleus
thereof). (It will be apparent that the DNA sequence need not be determined if it is
already known.)

The regulatory region could be quite remote from the structural region of the gene of 
interest (e.g. a distant enhancer sequence or similar). Preferably the zinc finger 
polypeptide is designed by one or both of the methods of the invention defined above. 

Binding of the zinc finger polypeptide to the target sequence may result in increased or 
reduced expression of the gene of interest depending, for example, on the nature of the 
target sequence (e.g. structural or regulatory) to which the polypeptide binds. 

In addition, the zinc finger polypeptide may advantageously comprise functional domains
from other proteins (e.g. catalytic domains from restriction enzymes, recombinases,
replicases, integrases and the like) or even "synthetic" effector domains. The polypeptide
may also comprise activation or processing signals, such as nuclear localisation signals.
These are of particular usefulness in targeting the polypeptide to the nucleus of the cell
in order to enhance the binding of the polypeptide to an intranuclear target (such as
genomic DNA). A particular example of such a localisation signal is that from the large
T antigen of SV40. Such other functional domains/signals and the like are conveniently
present as a fusion with the zinc finger polypeptide. Other desirable fusion partners
comprise immunoglobulins or fragments thereof (e.g. Fab, scFv) having binding activity.

The zinc finger polypeptide may be synthesised in situ in the cell as a result of delivery
to the cell of DNA directing expression of the polypeptide. Methods of facilitating
delivery of DNA are well-known to those skilled in the art and include, for example,
recombinant viral vectors (e.g. retroviruses, adenoviruses), liposomes and the like.
Alternatively, the zinc finger polypeptide could be made outside the cell and then delivered
thereto. Delivery could be facilitated by incorporating the polypeptide into liposomes etc.
or by attaching the polypeptide to a targeting moiety (such as the binding portion of an
antibody or hormone molecule). Indeed, one significant advantage of zinc finger proteins
over oligonucleotides or protein-nucleic acids (PNAs) in controlling gene expression
would be the vector-free delivery of protein to target cells. Unlike the above, many
examples of soluble proteins entering cells are known, including antibodies to cell surface
receptors. The present inventors are currently carrying out fusions of anti-bcr-abl fingers
(see example 3 below) to a single-chain (sc) Fv fragment capable of recognising NIP
(4-hydroxy-5-iodo-3-nitrophenyl acetyl). Mouse transferrin conjugated with NIP will be used
to deliver the fingers to mouse cells via the mouse transferrin receptor.

Media (e.g. microtitre wells, resins etc.) coated with NIP can also be used as solid 
supports for zinc fingers fused to anti-NIP scFvs, for applications requiring immobilised 
zinc fingers (e.g. the purification of specific nucleic acids). 

In a particular embodiment, the invention provides a method of inhibiting cell division by 
causing the presence in a cell of a zinc finger polypeptide which inhibits the expression 
of a gene enabling the cell to divide. 

In a specific embodiment, the invention provides a method of treating a cancer, 
comprising delivering to a patient, or causing to be present therein, a zinc finger 
polypeptide which inhibits the expression of a gene enabling the cancer cells to divide. 
The target could be, for example, an oncogene or a normal gene which is overexpressed
in the cancer cells.

To the best knowledge of the inventors, design of a zinc finger polypeptide and its 
successful use in modulation of gene expression (as described below) has never previously 
been demonstrated. This breakthrough presents numerous possibilities. In particular, zinc 
finger polypeptides could be designed for therapeutic and/or prophylactic use in regulating 
the expression of disease-associated genes. For example, zinc finger polypeptides could 
be used to inhibit the expression of foreign genes (e.g. the genes of bacterial or viral 
pathogens) in man or animals, or to modify the expression of mutated host genes (such 
as oncogenes).

The invention therefore provides a zinc finger polypeptide capable of inhibiting the
expression of a disease-associated gene. Typically the zinc finger polypeptide will not be
a naturally-occurring polypeptide but will be specifically designed to inhibit the expression
of the disease-associated gene. Conveniently the polypeptide will be designed by one or
both of the methods of the invention defined above. Advantageously the disease-associated
gene will be an oncogene, typically the BCR-ABL fusion oncogene or a ras oncogene. In
a particular embodiment the invention provides a zinc finger polypeptide designed to bind
to the DNA sequence GCAGAAGCC and capable of inhibiting the expression of the BCR-ABL
fusion oncogene.

In yet another aspect the invention provides a method of modifying a nucleic acid sequence
of interest present in a sample mixture by binding thereto a zinc finger polypeptide,
comprising contacting the sample mixture with a zinc finger polypeptide having affinity
for at least a portion of the sequence of interest, so as to allow the zinc finger polypeptide
to bind specifically to the sequence of interest.

The term "modifying" as used herein is intended to mean that the sequence is considered
modified simply by the binding of the zinc finger polypeptide. It is not intended to
suggest that the sequence of nucleotides is changed, although such changes (and others)
could ensue following binding of the zinc finger polypeptide to the nucleic acid of interest.
Conveniently the nucleic acid sequence is DNA.

Modification of the nucleic acid of interest (in the sense of binding thereto by a zinc finger
polypeptide) could be detected in any of a number of methods (e.g. gel mobility shift
assays, use of labelled zinc finger polypeptides - labels could include radioactive,
fluorescent, enzyme or biotin/streptavidin labels).

Modification of the nucleic acid sequence of interest (and detection thereof) may be all that
is required (e.g. in diagnosis of disease). Desirably however, further processing of the
sample is performed. Conveniently the zinc finger polypeptide (and nucleic acid
sequences specifically bound thereto) are separated from the rest of the sample.
Advantageously the zinc finger polypeptide is bound to a solid phase support, to facilitate
such separation. For example, the zinc finger polypeptide may be present in an
acrylamide or agarose gel matrix or, more preferably, is immobilised on the surface of
a membrane or in the wells of a microtitre plate.

Possible uses of suitably designed zinc finger polypeptides are: 

a) Therapy (e.g. targeting to double stranded DNA)

b) Diagnosis (e.g. detecting mutations in gene sequences:
the present work has shown that "tailor made" zinc finger polypeptides can distinguish
sequences differing by one base pair).

c) DNA purification (the zinc finger polypeptide could be used to purify restriction
fragments from solution, or to visualise DNA fragments on a gel [for example, where the
polypeptide is linked to an appropriate fusion partner, or is detected by probing with an
antibody]).

In addition, zinc finger polypeptides could even be targeted to other nucleic acids such as
ss or ds RNA (e.g. self-complementary RNA such as is present in many RNA molecules),
or to RNA-DNA hybrids, which would present another possible mechanism of affecting
cellular events at the molecular level.

In Example 1 the inventors describe and successfully demonstrate the use of the phage
display technique to construct and screen a random zinc finger binding motif library, using
a defined oligonucleotide target sequence.

In Example 2 is disclosed the analysis of zinc finger binding motif sequences selected by
the screening procedure of Example 1, the DNA-specificity of the motifs being studied by
binding to a mini-library of randomised DNA target sequences to reveal a pattern of
acceptable bases at each position in the target triplet: a "binding site signature".
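
As a hedged sketch only, a binding site signature of this kind could be tabulated from mini-library binding signals as shown below. The threshold rule, the function name `binding_site_signature` and all numerical values are invented for illustration; they are not data from the Examples, and the specification does not prescribe this particular calculation.

```python
def binding_site_signature(signals: dict[tuple[int, str], float],
                           threshold: float) -> list[str]:
    """Return, for each triplet position, the bases accepted by a motif.

    `signals` maps (position, fixed base) to the binding signal measured
    with the mini-library in which that base is fixed at that position.
    A base counts as acceptable if its signal reaches `threshold`
    (an assumed cut-off, chosen here only for illustration).
    """
    signature = []
    for position in range(3):
        accepted = [
            base for base in "ACGT"
            if signals.get((position, base), 0.0) >= threshold
        ]
        signature.append("".join(accepted) or "-")
    return signature


# Invented signals for a hypothetical motif (not experimental results).
example_signals = {
    (0, "A"): 0.10, (0, "C"): 0.05, (0, "G"): 0.90, (0, "T"): 0.08,
    (1, "A"): 0.60, (1, "C"): 0.80, (1, "G"): 0.10, (1, "T"): 0.10,
    (2, "A"): 0.10, (2, "C"): 0.10, (2, "G"): 0.70, (2, "T"): 0.50,
}
signature = binding_site_signature(example_signals, threshold=0.4)
```

With these invented values the hypothetical motif accepts only G at the first position but two bases at each of the other positions.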

In Example 3, the findings of the first two sections are used to select and modify rationally
a zinc finger binding polypeptide in order to bind to a particular DNA target with high
affinity: it is convincingly shown that the peptide binds to the target sequence and can
modify gene expression in cells cultured in vitro.

Example 4 describes the development of an alternative zinc finger binding motif library. 

Example 5 describes the design of a zinc finger binding polypeptide which binds to a DNA
sequence of special clinical significance.

The invention will now be further described by way of example and with reference to the
accompanying drawings, of which:

Figure 1 is a schematic representation of affinity purification of phage particles displaying 
zinc finger binding motifs fused to phage coat proteins; 

Figure 2 shows three amino acid sequences used in the phage display library; 

Figure 3 shows the DNA sequences of three oligonucleotides used in the affinity 
purification of phage display particles; 

Figure 4 is a "checker board" of binding site signatures determined for various zinc finger
binding motifs;

Figure 5 shows three graphs of fractional saturation against concentration of DNA (nM) 
for various binding motifs and target DNA triplets; 

Figure 6 shows the nucleotide sequence of the fusion between BCR and ABL sequences in
p190 cDNA and the corresponding exon boundaries in the BCR and ABL genes;

Figure 7 shows the amino acid sequences of various zinc finger binding motifs designed
to test for binding to the BCR/ABL fusion;



Figure 8 is a graph of peptide binding (as measured by absorbance) against DNA
concentration (µM);

Figures 9 to 12 [descriptions largely illegible in the source; the legible portions refer to
chloramphenicol acetyl transferase (CAT) results shown in the lower panel as a bar chart,
to Figure 11 in connection with cells, and to a graph];

Figures 13 and 14 illustrate schematically different methods of designing zinc finger
binding polypeptides; and

Figure 15 shows an amino acid sequence [remainder of description illegible in the source].

Example 1 

[The opening passage of Example 1 is illegible in the source. The legible fragments refer
to zinc finger binding motifs displayed on the surface of phage, to fingers capable of
binding to DNA, and to randomised positions.]
proteins is the cloning of peptides (Smith 1985 Science 228, 1315-1317), or protein
domains (McCafferty et al., 1990 Nature (London) 348, 552-554; Bass et al., 1990
Proteins 8, 309-314), as fusions to the minor coat protein (pIII) of bacteriophage fd, which
leads to their expression on the tip of the capsid. Phage displaying the peptides of interest
can then be affinity purified and amplified for use in further rounds of selection and for
DNA sequencing of the cloned gene. The inventors applied this technology to the study
of zinc finger-DNA interactions after demonstrating that functional zinc finger proteins can
be displayed on the surface of fd phage, and that the engineered phage can be captured on
a solid support coated with specific DNA. A phage display library was created
comprising variants of the middle finger from the DNA binding domain of Zif268 (a
mouse transcription factor containing 3 zinc fingers - Christy et al., 1988). DNA of fixed
sequence was used to purify phage from this library over several rounds of selection,
returning a number of different but related zinc fingers which bind the given DNA. By
comparing similarities in the amino acid sequences of functionally equivalent fingers we
deduce the likely mode of interaction of these fingers with DNA. Remarkably, it would
appear that many base contacts can occur from three primary positions on the α-helix of
the zinc finger, correlating (in hindsight) with the implications of the crystal structure of
Zif268 bound to DNA (Pavletich & Pabo 1991). The ability to select or design zinc
fingers with desired specificity means that DNA binding proteins containing zinc fingers
can now be "made-to-measure".

MATERIALS AND METHODS 

Construction and cloning of genes. The gene for the first three fingers (residues ,-101)
of Transcription Factor IIIA (TFIIIA) was amplified by PCR from the cDNA clone of
TFIIIA using forward and backward primers which contain restriction sites for NotI and
SfiI respectively. The gene for the Zif268 fingers (residues 333-420) was assembled from
8 overlapping synthetic oligonucleotides, giving SfiI and NotI overhangs. The genes for
fingers of the phage library were synthesised from 4 oligonucleotides by directional end
to end ligation using 3 short complementary linkers, and amplified by PCR from the single
strand using forward and backward primers which contained sites for NotI and SfiI
respectively. Backward PCR primers in addition introduced Met-Ala-Glu as the first three
amino acids of the zinc finger peptides, and these were followed by the residues of the
wild type or library fingers as discussed in the text. Cloning overhangs were produced
by digestion with SfiI and NotI where necessary. Fragments were ligated to 1µg similarly
prepared Fd-Tet-SN vector. This is a derivative of fd-tet-DOG1 (Hoogenboom et al.,
1991 Nucleic Acids Res. 19, 4133-4137) in which a section of the pelB leader and a
restriction site for the enzyme SfiI (underlined) have been added by site-directed
mutagenesis using the oligonucleotide (Seq ID No. 1):

5' CTCCTGCAGTTGGACCTGTGCCATGGCCG 
GCTGGGC CGC ATAG AATGGAAC A ACTA AAOC 3' 

which anneals in the region of the polylinker (L. Jespers, personal communication).
Electrocompetent DH5α cells were transformed with recombinant vector in 200ng
aliquots, grown for 1 hour in 2xTY medium with 1% glucose, and plated on TYE
containing 15µg/ml tetracycline and 1% glucose.

Figure 2 shows the amino acid sequence (Seq ID No. 2) of the three zinc fingers from
Zif268 used in the phage display library. The top and bottom rows represent the sequence
of the first and third fingers respectively. The middle row represents the sequence of the
middle finger. The randomised positions in the α-helix of the middle finger have residues
marked 'X'. The amino acid positions are numbered relative to the first helical residue
(position 1). For amino acids at positions -1 to +8, excluding the conserved Leu and His,
codons are equal mixtures of (G,A,C)NN: T in the first base position is omitted in order
to avoid stop codons, but this has the unfortunate effect that the codons for Trp, Phe, Tyr
and Cys are not represented. Position +9 is specified by the codon A(G,A)G, allowing
either Arg or Lys. Residues of the hydrophobic core are circled, whereas the zinc ligands
are written as white letters on black circles. The positions forming the β-sheets and the
α-helix of the zinc fingers are marked below the sequence.
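
The consequences of this codon scheme can be checked with a short sketch using the standard genetic code. The helper name `encoded_residues` is hypothetical; the codon table is the standard one, written out in the conventional T, C, A, G order.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def encoded_residues(first: str, second: str, third: str) -> set[str]:
    """Amino acids (one-letter codes, '*' = stop) encoded by a mixed codon.

    Each argument lists the bases allowed at that codon position.
    """
    return {CODON_TABLE[a + b + c] for a in first for b in second for c in third}


# (G,A,C)NN: no T in the first position, any base in the other two.
library_residues = encoded_residues("GAC", "TCAG", "TCAG")
missing = sorted(set(AMINO) - {"*"} - library_residues)

# A(G,A)G at position +9.
position9 = encoded_residues("A", "GA", "G")
```

The enumeration confirms that the (G,A,C)NN mixture contains no stop codon and lacks exactly Cys, Phe, Trp and Tyr (C, F, W, Y), and that A(G,A)G encodes only Arg and Lys.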

Phage selection. Colonies were transferred from plates to 200ml 2xTY/Zn/Tet (2xTY
containing 50µM Zn(CH3.COO)2 and 15µg/ml tetracycline) and grown overnight. Phage
were purified from the culture supernatant by two rounds of precipitation using 0.2
volumes of 20% PEG/2.5M NaCl containing 50µM Zn(CH3.COO)2, and resuspended in
zinc finger phage buffer (20mM HEPES pH7.5, 50mM NaCl, 1mM MgCl2 and 50µM
Zn(CH3.COO)2). Streptavidin-coated paramagnetic beads (Dynal) were washed in zinc
finger phage buffer and blocked for 1 hour at room temperature with the same buffer
made up to 6% in fat-free dried milk (Marvel). Selection of phage was over three rounds:
in the first round, beads (1 mg) were saturated with biotinylated oligonucleotide (~80nM)
and then washed prior to phage binding, but in the second and third rounds 1.7nM
oligonucleotide and 5µg poly dGC (Sigma) were added to the beads with the phage.
Binding reactions (1.5ml) for 1 hour at 15°C were in zinc finger phage buffer made up
to 2% in fat-free dried milk (Marvel) and 1% in Tween 20, and typically contained 5x10^11
phage. Beads were washed 15 times with 1ml of the same buffer. Phage were eluted by
shaking in 0.1M triethylamine for 5min and neutralised with an equal volume of 1M Tris
pH7.4. Log phase E. coli TG1 in 2xTY were infected with eluted phage for 30min at
37°C and plated as described above. Phage titres were determined by plating serial
dilutions of the infected bacteria.

The phage selection procedure, based on affinity purification, is illustrated schematically
in Figure 1: zinc fingers (A) are expressed on the surface of fd phage (B) as fusions to
the minor coat protein (C). The third finger is mainly obscured by the DNA helix. Zinc
finger phage are bound to 5'-biotinylated DNA oligonucleotide [D] attached to
streptavidin-coated paramagnetic beads [E], and captured using a magnet [F]. (Figure
adapted from Dynal AS and also Marks et al. (1992 J. Biol. Chem. 267, 16007-16105).)

Figure 3 shows sequences (Seq ID Nos. 3-8) of DNA oligonucleotides used to purify (i)
phage displaying the first three fingers of TFIIIA, (ii) phage displaying the three fingers
of Zif268, and (iii) zinc finger phage from the phage display library. The Zif268
consensus operator sequence used in the X-ray crystal structure (Pavletich & Pabo 1991
Science 252, 809-817) is highlighted in (ii), and in (iii) where "X" denotes a base change
from the ideal operator in oligonucleotides used to purify phage with new specificities.
Biotinylation of one strand is shown by a circled "B".

Sequencing of selected phage. Single colonies of transformants obtained after three
rounds of selection as described were grown overnight in 2xTY/Zn/Tet. Small aliquots
of the cultures were stored in 15% glycerol at -20°C. Single-stranded DNA was prepared
from phage in the culture supernatant and sequenced using the Sequenase™ 2.0 kit (U.S.
Biochemical Corp.).

RESULTS AND DISCUSSION

Phage display of three-finger DNA-binding domains from TFIIIA or Zif268. Prior to [...]
fully functional zinc fingers [...] when cloned in the vector Fd-Tet-SN. In preliminary
experiments, the inventors cloned as fusions to pIII firstly the N-terminal fingers from
TFIIIA ([...] Cell 39, 479-489), and secondly the three fingers from Zif268 (Christy et
al., 1988), for [...]. Approximately 10-20% of total pIII in [...] was present as fusion
protein.

Phage displaying either set of fingers were capable of binding to specific DNA
oligonucleotides, indicating that zinc fingers were expressed and [...] in both
instances. Paramagnetic beads coated with specific oligonucleotide were used as a
medium on which to capture DNA-binding phage, and were consistently able to return
[...] non-specific DNA. Alternatively, when phage displaying the three fingers of Zif268
were diluted [...] with Fd-Tet-SN phage not bearing zinc fingers, and the mixture
incubated with beads coated with Zif268 operator DNA, one in three of the total phage
eluted and transfected into E. coli were shown by colony hybridisation to carry the Zif268
gene, indicating an enrichment factor of over 500 for the zinc finger phage. It is thus
clear that zinc fingers displayed on fd phage are capable of preferential binding to DNA
sequences with which they can form specific complexes, making possible the enrichment
of wanted phage by factors of up to 500 in a single affinity purification step. Therefore,
over multiple rounds of selection and amplification, very rare clones capable of
sequence-specific DNA binding can be selected from a large library.
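
The effect of repeated rounds can be illustrated numerically. The sketch below is not from the original disclosure. It assumes, as an idealisation, that every round multiplies the odds of wanted to unwanted phage by the same enrichment factor (500 is the single-step figure quoted above), and the starting frequency of one clone in a million is a hypothetical value chosen for illustration.

```python
def frequency_after_rounds(start_frequency, enrichment, rounds):
    """Frequency of a wanted clone after repeated selection.

    Assumes the odds (wanted : unwanted) are multiplied by `enrichment`
    in every round, which is an idealisation.
    """
    odds = start_frequency / (1.0 - start_frequency)
    odds *= enrichment ** rounds
    return odds / (1.0 + odds)


# Hypothetical clone present at 1 in 10^6 before selection.
for n in range(4):
    print(n, frequency_after_rounds(1e-6, 500, n))
```

Under these assumptions a clone that is initially one in a million dominates the population after three rounds, which is the number of rounds used in the selections described here.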

A phage display library of zinc fingers from Zif268. The inventors have made a phage
display library of the three fingers of Zif268 in which selected residues in the middle
finger are randomised, and have selected fingers of desired specificity using a modified
Zif268 operator sequence (Christy & Nathans 1989 Proc. Natl. Acad. Sci. USA 86,
8737-8741) in which the middle DNA triplet is altered to the sequence of interest
(Figure 3). In order to study the putative base recognition positions which are suggested
by database analysis (Jacobs 1992 EMBO J. 11, 4507-4517), the inventors have designed
the library of the middle finger so that, relative to the first residue in the α-helix
(position +1), positions -1 to +8, but excluding the conserved Leu and His, can be any
amino acid except Phe, Tyr, Trp and Cys, which occur only rarely at those positions
(Jacobs 1993 Ph.D. thesis, University of Cambridge). In addition, the inventors have
allowed position +9 (which might make an inter-finger contact with Ser at position -2
(Pavletich & Pabo 1991)) to be either Arg or Lys, the two most frequently occurring
residues at that position.

The logic of this protocol, based upon the Zif268 crystal structure (Pavletich & Pabo
1991), is that the randomised finger is directed to the central triplet since the overall
register of protein-DNA contacts is fixed by its two neighbours. This allows the
examination of which amino acids in the randomised finger are the most important in
forming specific complexes with DNA of known sequence. Since comprehensive
variations are programmed in all the putative contact positions of the α-helix, it is possible
to conduct an objective study of the importance of each position in DNA-binding (Jacobs
1992).

The size of the phage display library required, assuming full degeneracy of the 8 variable
positions, is (16^7 x 2^1) = 5.4 x 10^8, but because of practical limitations in the
efficiency of transformation with Fd-Tet-SN, the inventors were able to clone only
2.6 x 10^6 of these. The library used is therefore some two hundred times smaller than the
theoretical size necessary to cover all the possible variations of the α-helix. Despite this
shortfall, it has been possible to isolate phage which bind with high affinity and specificity
to given DNA sequences, demonstrating the remarkable versatility of the zinc finger motif.
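
The arithmetic behind the library size can be reproduced in a few lines. This sketch is an illustration rather than part of the original disclosure. It assumes 16 possible residues at each of 7 fully randomised positions and 2 at position +9, and takes the number of clones obtained as 2.6 x 10^6, which is our reading of a partly illegible figure in the text.

```python
RESIDUES_PER_POSITION = 16   # amino acids reachable with (G,A,C)NN codons
FULLY_RANDOMISED = 7         # positions -1 to +8, excluding Leu and His
POSITION_9_CHOICES = 2       # Arg or Lys

theoretical_size = RESIDUES_PER_POSITION ** FULLY_RANDOMISED * POSITION_9_CHOICES
clones_obtained = 2.6e6      # assumed reading of the figure in the text
shortfall = theoretical_size / clones_obtained

print(theoretical_size)      # number of distinct helix sequences
print(round(shortfall))      # fold difference between theory and practice
```

The result is consistent with a library a couple of hundred times smaller than the theoretical size.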

Amino acid-base contacts in zinc finger-DNA complexes deduced from phage display
selection. Of the 64 base triplets that could possibly form the binding site for variations
of finger 2, the inventors have so far used 32 in attempts to isolate zinc finger phage as
described. Results from these selections are shown in Table 1, which lists amino acid
sequences of the variant α-helical regions from clones of library phage selected after 3
rounds of screening with variants of the Zif268 operator.

In Table 1, the amino acid sequences, aligned in the one letter code, are listed alongside
the DNA oligonucleotides (a to p) used in their purification. The latter are denoted by the
sequence of the central DNA triplet in the "bound" strand of the variant Zif268 operator.
The amino acid positions are numbered relative to the first helical residue (position 1), and
the three primary recognition positions are highlighted. The accompanying numbers
indicate the independent occurrences of that clone in the sequenced population (5-10
colonies); where numbers are in parentheses, the clone(s) were detected in the penultimate
round of selection but not in the final round. In addition to the DNA triplets shown here,
others were also used in attempts to select zinc finger phage from the library, but most
selected two clones, one having the α-helical sequence KASNLVSHIR, and the other
having the sequence LRHNLETHMR. Those triplets were: ACT, AAA, TTT, CCT,
CTT, TTC, AGT, CGA, CAT, AGA, AGC and AAT.

In general the inventors have been unable to select zinc fingers which bind specifically to 
triplets without a 5' or 3' guanine, all of which return the same limited set of phage after 
three rounds of selection (see above). However, for each of the other triplets used to screen the
library, a family of zinc finger phage is recovered. In these families is found a sequence 
bias in the randomised α-helix, which is interpreted as revealing the position and identity
of amino acids used to contact the DNA. For instance, the middle fingers from the 8
different clones selected with the triplet GAT (Table 1d) all have Asn at position +3 and
Arg at position +6, just as does the first zinc finger of the Drosophila protein tramtrack
in which they are seen making contacts to the same triplet in the cocrystal with specific
DNA (Fairall et al., 1993). This indicates that the positional recurrence of a particular
amino acid in functionally equivalent fingers is unlikely to be coincidental, but rather
occurs because it has a functional role. Thus using data collected from the phage display
library (Table 1) it is possible to infer most of the specific amino acid-DNA interactions.
Remarkably, most of the results can be rationalised in terms of contacts from the three
primary α-helical positions (-1, +3 and +6) identified by X-ray crystallography (Pavletich
& Pabo 1991) and database analysis (Jacobs 1992).

As has been pointed out before (Berg 1992 Proc. Natl. Acad. Sci. USA 89, 11109-11110),
guanine has a particularly important role in zinc finger-DNA interactions. When present
at the 5' (e.g. Table 1c-i) or 3' (e.g. Table 1m-o) end of a triplet, G selects fingers with
Arg at position +6 or -1 of the α-helix respectively. When G is present in the middle
position of a triplet (e.g. Table 1b), the preferred amino acid at position +3 is His.
Occasionally, G at the 5' end of a triplet selects Ser or Thr at +6 (e.g. Table 1p). Since
G can only be specified absolutely by Arg (Seeman et al., 1976 Proc. Natl. Acad. Sci.
USA 73, 804-808), this is the most common determinant at -1 and +6. One can expect
this type of contact to be a bidentate hydrogen bonding interaction as seen in the crystal
structures of Zif268 (Pavletich & Pabo 1991 Science 252, 809-817) and tramtrack (Fairall
et al., 1993). In these structures, and in almost all of the selected fingers in which Arg
recognises G at the 3' end, Asp occurs at position +2 to buttress the long Arg side chain
(e.g. Table 1o,p). When position -1 is not Arg, Asp rarely occurs at +2, suggesting that
in this case any other contacts it might make with the second DNA strand do not
contribute significantly to the stability of the protein-DNA complex.

Adenine is also an important determinant of sequence specificity, recognised almost
exclusively by Asn or Gln which again are able to make bidentate contacts (Seeman et al.,
1976). When A is present at the 3' end of a triplet, Gln is often selected at position -1
of the α-helix, accompanied by small aliphatic residues at +2 (e.g. Table 1b). Adenine
in the middle of the triplet strongly selects Asn at +3 (e.g. Table 1c-e), except in the
triplet CAG (Table 1a) which selected only two types of finger, both with His at +3 (one
being the wild-type Zif268 which contaminated the library during this experiment). The
triplets ACG (Table 1j) and ATG (Table 1k), which have A at the 5' end, also returned
oligoclonal mixtures of phage, the majority of which were of one clone with Asn at +6.

In theory, cytosine and thymine cannot reliably be discriminated by a hydrogen bonding
amino acid side chain in the major groove (Seeman et al., 1976). Nevertheless, C in the
3' position of a triplet shows a marked preference for Asp or Glu at position -1, together
with Arg at +1 (e.g. Table 1e-g). Asp is also sometimes selected at +3 and +6 when
C is in the middle (e.g. Table 1o) and 5' (e.g. Table 1a) position respectively. Although
Asp can accept a hydrogen bond from the amino group of C, one should note that the
positive molecular charge of C in the major groove (Hunter 1993 J. Mol. Biol. 230,
1025-1054) will favour an interaction with Asp regardless of hydrogen bonding contacts.
However, C in the middle position most frequently selects Thr (e.g. Table 1), Val or Leu
(e.g. Table 1o) at +3. Similarly, T in the middle position most often selects Ser (e.g.
Table 1), Ala or Val (e.g. Table 1p) at +3. The aliphatic amino acids are unable to make
hydrogen bonds but Ala probably has a hydrophobic interaction with the methyl group of
T, whereas a longer side chain such as Leu can exclude T and pack against the ring of C.
When T is at the 5' end of a triplet, Ser and Thr are selected at +6 (as is occasionally the
case for G at the 5' end). Thymine at the 3' end of a triplet selects a variety of polar
amino acids at -1 (e.g. Table 1d), and occasionally returns fingers with Ser at +2 (e.g.
Table 1) which could make a contact as seen in the tramtrack crystal structure (Fairall
et al., 1993).
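
The selection tendencies described in the preceding paragraphs can be collected into a small lookup table. The sketch below is our own hand-compiled summary of the prose, not the contents of Table 1 or Table 2, and the function name is hypothetical. It records only which residues were frequently selected, which is not the same as verified specificity.

```python
# Helix position that reads each base of the triplet (5' to 3').
HELIX_POSITION = {"5'": "+6", "middle": "+3", "3'": "-1"}

# Residues reported above as frequently selected, compiled by hand from the text.
TENDENCIES = {
    "5'": {"G": ["Arg", "Ser", "Thr"], "A": ["Asn"],
           "T": ["Ser", "Thr"], "C": ["Asp"]},
    "middle": {"G": ["His"], "A": ["Asn"],
               "C": ["Thr", "Val", "Leu", "Asp"], "T": ["Ser", "Ala", "Val"]},
    # 3' T selected a variety of polar residues, so no single entry is listed.
    "3'": {"G": ["Arg"], "A": ["Gln"], "C": ["Asp", "Glu"], "T": []},
}


def suggest(triplet):
    """Map a DNA triplet (written 5' to 3') to residues reported per helix position."""
    places = ("5'", "middle", "3'")
    return {HELIX_POSITION[place]: TENDENCIES[place][base]
            for place, base in zip(places, triplet.upper())}


print(suggest("GAT"))
```

For the triplet GAT the lookup returns Arg at +6 and Asn at +3, in line with the family of clones described for that triplet.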

Limitations of phage display. From Table 1 it can be seen that a consensus
usually occurs in two of the three primary positions (-1, +3 and +6) for any family of
equivalent fingers, suggesting that in many cases phage selection is by virtue of only two
base contacts per finger, as is observed in the Zif268 crystal structure (Pavletich & Pabo
1991). Accordingly, identical finger sequences are often returned by DNA sequences
differing by one base in the central triplet. One reason for this is that the phage display
selection, being essentially purification by affinity, can yield zinc fingers which bind
equally tightly to a number of DNA triplets and so are unable to discriminate. Secondly,
since complex formation is governed by the law of mass action, affinity selection can
favour those clones whose representation in the library is greatest even though their true
affinity for DNA is less than that of other clones less abundant in the library. Phage
display selection by affinity is therefore of limited value in distinguishing between
permissive and specific interactions beyond those base contacts necessary to stabilise the
complex. Thus in the absence of competition from fingers which are able to bind
specifically to a given DNA, the tightest non-specific complexes will be selected from the
phage library. Consequently, results obtained by phage display selection from a library
must be confirmed by specificity assays, particularly when that library is of limited size.
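
The mass-action argument can be made concrete with a toy calculation. The sketch below is an illustration only. It assumes DNA in excess over phage, so that the fraction of a clone bound is [DNA]/(Kd + [DNA]), and it uses hypothetical copy numbers and dissociation constants; only the 1.7nM oligonucleotide concentration is taken from the selection protocol.

```python
def bound_amount(copies, kd_nM, dna_nM):
    """Phage captured at equilibrium: copies multiplied by fractional occupancy.

    Assumes DNA is in excess over phage, so occupancy = [DNA] / (Kd + [DNA]).
    """
    return copies * dna_nM / (kd_nM + dna_nM)


dna = 1.7  # nM oligonucleotide, as used in the second and third rounds

# Hypothetical clones: a rare tight binder and an abundant weaker binder.
rare_tight = bound_amount(copies=1, kd_nM=10.0, dna_nM=dna)
abundant_weak = bound_amount(copies=50, kd_nM=100.0, dna_nM=dna)

print(rare_tight, abundant_weak)
```

With these numbers the abundant clone is captured in greater amount despite a tenfold weaker affinity, which is why selection results need to be confirmed by independent specificity assays.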

Conclusion. The amino acid sequence biases observed within a family of functionally
equivalent zinc fingers indicate that, of the α-helical positions randomised in this study,
only three primary (-1, +3 and +6) and one auxiliary (+2) positions are involved in the
recognition of DNA. Moreover, a limited set of amino acids are to be found at those
positions, and it is presumed that these make contacts to bases. The indications therefore
are that a code can be derived to describe zinc finger-DNA interactions. At this stage
however, although sequence homologies are strongly suggestive of amino acid preferences
for particular base-pairs, one cannot confidently deduce such rules until the specificity of
individual fingers for DNA triplets is confirmed. The inventors therefore defer making
a summary table of these preferences until the following example, in which is described
how randomised DNA binding sites can be used to this end.

While this work was in progress, a paper by Rebar and Pabo was published (Rebar & 
Pabo 1994 Science 263, 671-673) in which phage display was also used to select zinc 
fingers with new DNA-binding specificities. These authors constructed a library in which 
the first finger of Zif268 is randomised, and screened with tetranucleotides to take into 
account end effects such as additional contacts from variants of this finger. Only 4 
positions (-1, +2, +3 and +6) were randomised, chosen on the basis of the earlier X-ray
crystal structures. The results presented above, in which more positions were randomised,
to some extent justify Rebar and Pabo's use of the four random positions without
apparent loss of effect, although further selections may reveal that the library is
compromised. However, randomising only four positions decreases the theoretical library
size so that full degeneracy can be achieved in practice. Nevertheless the inventors found
that the results obtained by Rebar and Pabo by screening their complete library with two
variant Zif268 operators are in agreement with the inventors' own conclusions derived
from an incomplete library. On the one hand this again highlights the versatility of zinc
fingers but, remarkably, so far both studies have been unable to produce fingers which
bind to the sequence CCT. It will be interesting to see whether sequence biases such as
the inventors have detected would be revealed if more selections were performed using
Rebar and Pabo's
library. In any case, it would be desirable to investigate the effects on selections of using 
different numbers of randomised positions in more complete libraries than have been used 
so far. 

The original position or context of the randomised finger in the phage display library
might bear on the efficacy of selected fingers when incorporated into a new DNA-binding
domain. Selections from a library of the outer fingers of a three finger peptide (Rebar &
Pabo, 1994 Science 263, 671-673; Jamieson et al. , 1994 Biochemistry 33, 5689-5695) are 
capable of producing fingers which bind DNA in various different modes, while selections 
from a library of the middle finger should produce motifs which are more constrained. 
Accordingly, Rebar and Pabo do not assume that the first finger of Zif268 will always
bind a triplet, and screened with a tetranucleotide binding site to allow for different 
binding modes. Thus motifs selected from libraries of the outer fingers might prove less 
amenable to the assembly of multifinger proteins, since binding of these fingers could be 
perturbed on constraining them to a particular binding mode, as would be the case for 
fingers which had to occupy the middle position of an assembled three-finger protein. In 
contrast, motifs selected from libraries of the middle finger, having been originally 
constrained, will presumably be able to preserve their mode of binding even when placed 
in the outer positions of an assembled DNA-binding domain. 

Figure 13 shows different strategies for the design of tailored zinc finger proteins. (A) 
A three-finger DNA-binding motif is selected en bloc from a library of three randomised 
fingers. (B) A three-finger DNA-binding motif is assembled out of independently selected 
fingers from a library of one randomised finger (e.g. the middle finger of Zif268). (C)
A three-finger DNA-binding motif is assembled out of independently selected fingers from 
three positionally specified libraries of randomised zinc fingers. 

Figure 14 illustrates the strategy of combinatorial assembly followed by en bloc selection. 
Groups of triplet-specific zinc fingers (A) isolated by phage display selection are 
assembled in random combinations and re-displayed on phage (B). A full-length target
site (C) is used to select en bloc the most favourable combination of fingers (D).

Example 2

[...] technique to deal efficiently with the selection of a DNA binding site for a given
zinc finger ([...]) [...] as a safeguard against spurious selections based on [...]. [...]
here to [...] the specificity of fingers [...]. The inventors found that [...] position of
the cognate triplet [...] discriminate between closely related triplets [...].

[...] zinc finger phage display, would provide [...] in which the same [...]. [...] method
for binding site [...] amplification of target DNA followed by sequencing), has [...]
Nucleic Acids Res. 18, 3203-3209; Pollock & Treisman 1990 Nucleic Acids Res. 18,
6197-[...]). [...] presents a convenient and rapid new method [...] binding [...] finger,
provided that the latter can tolerate the defined base pair. Each zinc finger phage
is screened against all the libraries individually immobilised in wells of a microtitre
plate [...] at each position is disclosed, which the inventors term a "binding site
signature". The information contained in a binding site signature [...] the repertoire of
binding sites recognised by a zinc finger.

The binding site signatures obtained, using zinc finger phage selected as described in
example 1, reveal that the selection has yielded some highly sequence-specific zinc finger
binding motifs which discriminate at all three positions of a triplet. From measurements
of equilibrium dissociation constants it is found that these fingers bind the sites
indicated in their signatures, and discriminate against closely related sites (usually by
a factor of ten). The binding site signatures allow progress towards a specificity
code for the interactions of zinc fingers with DNA.

MATERIALS AND METHODS 

[...] flat-bottomed 96-well microtitre plates (Falcon) were coated overnight at 4°C with
streptavidin (0.1mg/ml in 0.1M NaHCO3 pH [...]) [...] containing 2% fat-free dried milk
(Marvel), washed 3 times with PBS/Zn containing [...] Tween, and another 3 times with
PBS/Zn. The "bound" strand of each oligonucleotide library was made synthetically and
the other strand extended from a 5'-biotinylated universal primer using DNA polymerase I
(Klenow fragment). Fill-in reactions were added to wells ([...] pmole DNA library in
each) in PBS/Zn for [...] minutes, then washed once with PBS/Zn containing 0.1% Tween,
and once again with PBS/Zn. Overnight bacterial cultures each containing a selected zinc
finger phage were grown in 2xTY containing 50µM Zn(CH3.COO)2 and 15µg/ml
tetracycline at 30°C. Culture supernatants containing phage were diluted tenfold by the
addition of PBS/Zn containing [...] fat-free dried milk (Marvel), 1% Tween and 20µg/ml
sonicated salmon sperm DNA. Diluted phage solutions (50µl) were applied to wells and
binding allowed to proceed for one hour at [...]. Unbound phage were removed by washing
5 times with PBS/Zn containing [...]% Tween, and then 3 times with PBS/Zn. Bound phage
were detected as described previously (Griffiths et al., 1994 EMBO J. in press), or using
HRP-conjugated anti-M13 IgG (Pharmacia), and quantitated using SOFTmax 2.32
(Molecular Devices Corp).

The results are shown in Figure 4, which gives the binding site signatures of individual 
zinc finger phage. The figure represents binding of zinc finger phage to randomised DNA 
immobilised in the wells of microtitre plates. To test each zinc finger phage against each 
oligonucleotide library (see above), DNA libraries are applied to columns of wells (down 
the plate), while rows of wells (across the plate) contain equal volumes of a solution of 
a zinc finger phage. The identity of each library is given as the middle triplet of the 
"bound" strand of Zif268 operator, where N represents a mixture of all 4 nucleotides. 
The zinc finger phage is specified by the sequence of the variable region of the middle 
finger, numbered relative to the first helical residue (position 1), and the three primary 
recognition positions are highlighted. Bound phage are detected by an enzyme 
immunoassay. The approximate strength of binding is indicated by a grey scale 
proportional to the enzyme activity. From the pattern of binding to DNA libraries, called 
the "signature" of each clone, one or a small number of binding sites can be read off and 
these are written on the right of the figure. 
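
Reading a binding site off a signature amounts to collecting, for each position of the triplet, the bases whose library gave a strong signal. The sketch below is a simplified illustration, not the inventors' procedure. The function name, the threshold and the signal values are all hypothetical, and the example values are invented to mimic a finger that accepts G or T at the 5' position and G elsewhere.

```python
def read_signature(signals, threshold=0.5):
    """Return the accepted bases at each triplet position.

    `signals` maps a library name such as 'GNN' (one defined base, two
    randomised) to a relative signal between 0 and 1. A position with no
    library above the threshold is reported as '?'.
    """
    accepted = [[], [], []]
    for library, signal in signals.items():
        # Locate the single defined base in the library name.
        position = next(i for i, base in enumerate(library) if base != "N")
        if signal >= threshold:
            accepted[position].append(library[position])
    return ["/".join(sorted(bases)) or "?" for bases in accepted]


# Invented signal values for illustration only.
example = {
    "GNN": 0.9, "ANN": 0.1, "TNN": 0.8, "CNN": 0.1,
    "NGN": 0.9, "NAN": 0.1, "NTN": 0.1, "NCN": 0.1,
    "NNG": 0.9, "NNA": 0.1, "NNT": 0.1, "NNC": 0.1,
}
print(read_signature(example))
```

A fixed threshold is a crude substitute for judging a grey scale by eye, and it ignores any dependence of one position on another.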

Determination of apparent equilibrium dissociation constants. Overnight bacterial
cultures were grown in 2xTY/Zn/Tet at 30°C. Culture supernatants containing phage
were diluted twofold by the addition of PBS/Zn containing 4% fat-free dried milk
(Marvel), 2% Tween and 40µg/ml sonicated salmon sperm DNA. Binding reactions,
containing appropriate concentrations of specific 5'-biotinylated DNA and equal volumes
of zinc finger phage solution, were allowed to equilibrate for 1h at 20°C. All DNA was
captured on streptavidin-coated paramagnetic beads (500µg per well) which were
subsequently washed 6 times with PBS/Zn containing 1% Tween and then 3 times with
PBS/Zn. Bound phage were detected using HRP-conjugated anti-M13 IgG (Pharmacia)
and developed as described (Griffiths et al., 1994). Optical densities were quantitated
using SOFTmax 2.32 (Molecular Devices Corp).

The results are shown in Figure 5, which is a series of graphs of fractional saturation
against concentration of DNA (nM). The two outer fingers carry the native sequence, as
do the two cognate outer DNA triplets. The sequence of amino acids occupying
helical positions -1 to +9 of the varied finger is shown in each case. The graphs show
that the middle finger can discriminate closely related triplets, usually by a factor of ten.
The graphs allowed the determination of apparent equilibrium dissociation constants, as
below.

Estimations of the Kd are by fitting to the equation Kd = [DNA].[P]/[DNA.P], using the
KaleidaGraph™ Version 2.0 programme (Abelbeck Software). Owing to the sensitivity
of the ELISA used to detect protein-DNA complex, the inventors were able to use zinc
finger phage concentrations far below those of the DNA, as is required for accurate
calculations of the Kd. The technique used here has the advantage that while the
concentration of DNA (variable) must be known accurately, that of the zinc fingers
(constant) need not be known (Choo & Klug 1993 Nucleic Acids Res. 21, 3341-3346).
This circumvents the problem of calculating the number of zinc finger peptides expressed
on the tip of each phage, although since only 10-20% of the gene III protein (pIII) carries
such peptides one would expect on average less than one copy per phage. Binding is
performed in solution to prevent any effects caused by the avidity (Marks et al., 1992) of
phage for DNA immobilised on a surface. Moreover, in this case measurements of Kd by
ELISA are made possible since equilibrium is reached in solution prior to capture on the
solid phase.
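
When the zinc finger concentration is far below that of the DNA, the equation above rearranges to a fractional saturation of [DNA]/(Kd + [DNA]), so that Kd can be estimated from a titration without knowing the phage concentration. The sketch below illustrates such a fit in Python. It is a stand-in for the KaleidaGraph fit, not a reproduction of it: the golden-section search, the function names and the synthetic data points are all our own assumptions.

```python
import math


def fractional_saturation(dna, kd):
    """Fraction of zinc finger bound when [DNA] greatly exceeds [P]."""
    return dna / (kd + dna)


def sum_squared_error(kd, points):
    return sum((theta - fractional_saturation(dna, kd)) ** 2
               for dna, theta in points)


def fit_kd(points, lowest=1e-3, highest=1e4, steps=100):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""
    a, b = math.log10(lowest), math.log10(highest)
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(steps):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sum_squared_error(10 ** c, points) < sum_squared_error(10 ** d, points):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10 ** ((a + b) / 2)


# Synthetic, noise-free titration generated with a hypothetical Kd of 5 nM.
true_kd = 5.0
concentrations = [0.5, 1, 2, 5, 10, 20, 50]   # nM DNA
points = [(dna, fractional_saturation(dna, true_kd)) for dna in concentrations]

estimate = fit_kd(points)
print(estimate)
```

With real ELISA data the optical densities would first have to be normalised to a saturating signal, and a search bounded between `lowest` and `highest` only makes sense if the true Kd lies within that range.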

RESULTS AND DISCUSSION 

The binding site signature of the second finger of Zif268. The top row of Figure 4
shows the signature of the second finger of wild type Zif268. From the pattern of strong
signals indicating binding to oligonucleotide libraries having GNN, TNN, NGN and NNG
as the middle triplet, it emerges that the optimal binding site for this finger is T/G,G,G,
in accord with the published consensus sequence (Christy & Nathans 1989 Proc. Natl.
Acad. Sci. USA 86, 8737-8741). This has implications for the interpretation of the X-ray
crystal structure of Zif268 solved in complex with consensus operator having TGG as the
middle triplet (Pavletich & Pabo 1991). For instance, His at position +3 of the middle
finger was modelled as donating a hydrogen bond to N7 of G, suggesting an equivalent
contact to be possible with N7 of A, but from the binding site signature we can see that
there is discrimination against A. This implies that the His may prefer to make a
hydrogen bond to O6 of G or a bifurcated hydrogen bond [...]. [...] clash with the amino
group of A may [...]. Thus by considering the stereochemistry of double helical DNA,
binding site signatures can give insight into the details of zinc finger-DNA interactions.

Amino acid-base contacts in zinc finger-DNA complexes [...]. The binding site signatures
of other zinc fingers [...] selections performed in example 1 yielded highly
sequence-specific DNA binding proteins. [...] of these are able to specify the base at
each position of the middle triplet [...]. [...] one can identify the fingers which [...]
a specific base at a defined position [...] comparisons of [...] any residues which have
genuine preferences for particular bases on bound DNA. With a few exceptions, these are
as previously predicted on the basis of phage display, and are summarised in Table 2.

Table 2 summarises frequently observed amino acid-base contacts in interactions of
selected zinc fingers with DNA. The given contacts comprise a "syllabic" recognition
code for appropriate triplets. Cognate amino acids and their positions in the α-helix are
entered in a matrix relating each base to each position of a triplet. Auxiliary amino acids
from position +2 can enhance or modulate specificity of amino acids at position -1 and
these are listed as pairs. Ser or Thr at position +6 permit Asp +2 of the following finger
(denoted Asp ++2) to specify both G and T indirectly, and the pairs are listed. [...]
while Val +3 appears to be consistently ambiguous.

The binding site signatures also reveal an important feature of the phage display library
which is important to the interpretation of the selection results. All the fingers in our
panel, regardless of the amino acid present at position +6, are able to recognise G or both
G and T at the 5' end of a triplet. The probable explanation for this is that the 5'
position of the middle triplet is fixed as either G or T by a contact from the invariant Asp
at position +2 of finger 3 to the partner of either base on the complementary strand,
analogous to those seen in the Zif268 (Pavletich & Pabo 1991 Science 252, 809-817) and
tramtrack (Fairall et al., 1993) crystal structures (a contact to NH2 of C or A respectively
in the major groove). Therefore Asp at position +2 of finger 3 is dominant over the
amino acid present at position +6 of the middle finger, precluding the possibility of
recognition of A or C at the 5' position. Future libraries must be designed with this
interaction omitted or the position varied. Interestingly, given the framework of the
conserved regions of the three fingers, one can identify a rule in the second finger which
specifies a frequent interaction with both G and T, viz the occurrence of Ser or Thr at
position +6, which may donate a hydrogen bond to either base.

Modulation of base recognition by auxiliary positions. As noted above, position +2
is able to specify the base directly 3' of the 'cognate triplet', and can thus work in
conjunction with position +6 of the preceding finger. The binding site signatures, whilst
pointing to amino acid-base contacts from the three primary positions, indicate that
auxiliary positions can play other parts in base recognition. A clear case in point is Gln
at position -1, which is specific for A at the 3' end of a triplet when position +2 is a
small non-polar amino acid such as Ala, though specific for T when polar residues such
as Ser are at position +2. The strong correlation between Arg at position -1 and Asp at
position +2, the basis of which is understood from the X-ray crystal structures of zinc
fingers, is another instance of interplay between these two positions. Thus the amino acid
at position +2 is able to modulate or enhance the specificity of the amino acid at other
positions.

At position +3, a different type of modulation is seen in the case of Thr and Val which
most often prefer C in the middle position of a triplet, but in some zinc fingers are able
to recognise both C and T. This ambiguity occurs possibly as a result of different
hydrophobic interactions involving the methyl groups of these residues, and here a
flexibility in the inclination of the finger rather than an effect from another position per
se may be the cause of ambiguous reading.

Quantitative measurements of dissociation constants. The binding site signature of a
zinc finger reveals its differential base preferences at a given concentration of DNA. As
the concentration of DNA is altered, one can expect the binding site signature of any clone
to change, being more distinctive at low [DNA], and becoming less so at higher [DNA]
as the Kd of less favourable sites is approached and further bases become acceptable at
each position of the triplet. Furthermore, because two base positions are randomly
occupied in any one library of oligonucleotides, binding site signatures are not formally
able to exclude the possibility of context dependence for some interactions. Therefore, to
supplement binding site signatures, which are essentially comparative, quantitative
determinations of the equilibrium dissociation constant of each phage for different DNA
binding sites are required. After phage display selection and binding site signatures, these
are the third and definitive stage in assessing the specificity of zinc fingers.
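
The concentration dependence of a signature can be illustrated with a simple one-site binding isotherm, in which the fraction of zinc finger bound is [DNA]/(Kd + [DNA]). The sketch below assumes that model, with DNA in excess over the zinc finger phage; the two Kd values are hypothetical and chosen only to show the trend, and the function names are introduced here for illustration.

```python
def fraction_bound(dna_conc, kd):
    """Fraction of zinc finger bound at DNA concentration dna_conc
    (same units as kd), assuming simple 1:1 equilibrium binding."""
    return dna_conc / (kd + dna_conc)


def discrimination(dna_conc, kd_good, kd_poor):
    """Ratio of binding to a favourable site over binding to a less favourable one."""
    return fraction_bound(dna_conc, kd_good) / fraction_bound(dna_conc, kd_poor)


# Hypothetical Kd values (molar): a tenfold difference between two sites.
KD_GOOD, KD_POOR = 1e-8, 1e-7

for conc in (1e-9, 1e-8, 1e-7, 1e-6):
    ratio = discrimination(conc, KD_GOOD, KD_POOR)
    print(f"[DNA] = {conc:.0e} M  binding ratio = {ratio:.2f}")
```

Under these assumptions the ratio approaches the ratio of the two Kd values at low [DNA] and falls towards 1 as [DNA] rises above both, which is why a signature is expected to be most distinctive at low [DNA].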

Examples of such studies presented in Figure 5 reveal that zinc finger phages bind the
operators indicated in their binding site signatures with Kds in the range of 10^-8 to 10^-9 M, and
can discriminate against closely related binding sites by factors greater than an order of
magnitude. Indeed Figure 5 shows such differences in affinity for binding sites which
differ in only one out of nine base pairs. Since the zinc fingers in our panel were selected
from a library by non-competitive affinity purification, there is the possibility that fingers
which are even more discriminatory can be isolated using a competitive selection process.

Measurements of dissociation constants allow different triplets to be ranked in order of
preference according to the strength of binding. The examples here indicate that the
contacts from either position -1 or +3 can contribute to discrimination. Also, the
ambiguity in certain binding site signatures referred to above can be shown to have a basis
in the equal affinity of certain fingers for closely related triplets. This is demonstrated by
the Kd of the finger containing the amino acid sequence RGDALTSHER for the triplets
TTG and GTG.

A code for zinc finger-DNA recognition. One would expect that the versatility of the
zinc finger motif will have allowed evolution to develop various modes of binding to DNA
(and even to RNA), which will be too diverse to fall under the scope of a single code.
However, although a code may not apply to all zinc finger-DNA interactions, there is now
convincing evidence that a code applies to a substantial subset. This code will fall short
of being able to predict unfailingly the DNA binding site preference of any given zinc
finger from its amino acid sequence, but may yet be sufficiently comprehensive to allow
the design of zinc fingers with specificity for a given DNA sequence.

Using the selection methods of phage display (as described above) and of binding site
signatures it is found that in the case of Zif268-like zinc fingers, DNA recognition
involves four fixed principal (three primary and one auxiliary) positions on the α-helix,
from where a limited and specific set of amino acid-base contacts result in recognition of
a variety of DNA triplets. In other words, a code can describe the interactions of zinc
fingers with DNA. Towards this code, one can propose amino acid-base contacts for
almost all the entries in a matrix relating each base to each position of a triplet (Table 2).
Where there is overlap, the results presented here complement those of Desjarlais and
Berg, who have derived similar rules by altering zinc finger specificity using database-guided
mutagenesis (Desjarlais & Berg 1992 Proc. Natl. Acad. Sci. USA 89, 7345-7349;
Desjarlais & Berg 1993 Proc. Natl. Acad. Sci. USA 90, 2256-2260).

Combinatorial use of the coded contacts. The individual base contacts listed in Table 
2, though part of a code, may not always result in sequence specific binding to the 
expected base triplet when used in any combination. In the first instance one must be 
aware of the possibility that zinc fingers may not be able to recognise certain combinations 
of bases in some triplets by use of this code, or even at all. Otherwise, the majority of 
inconsistencies may be accounted for by considering variations in the inclination of the 
trident reading head of a zinc finger with respect to the triplet with which it is interacting. 
It appears that the identity of an amino acid at any one α-helical position is attuned to the
identity of the residues at the other two positions to allow three base contacts to occur
simultaneously. Therefore, for example, in order that Ala may pick out T in the triplet
GTG, Arg must not be used to recognise G from position +6, since this would distance
the former too far from the DNA (see for example the finger containing the amino acid
sequence RGDALTSHER). Secondly, since the pitch of the α-helix is 3.6 amino acids
per turn, positions -1, +3 and +6 are not an integral number of turns apart, so that
position +3 is nearer to the DNA than are -1 or +6. Hence, for example, short amino
acids such as His and Asn, rather than the longer Arg and Gln, are used for the
recognition of purines in the middle position of a triplet.
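
The arithmetic behind the statement that positions -1, +3 and +6 are not an integral number of turns apart can be checked in a few lines. The sketch below uses only the 3.6 residues per turn quoted above; it assumes that helical positions are numbered -1, +1, +2, ... with no position 0, so that +3 and +6 lie three and six residues after position -1. The function names are introduced here for illustration.

```python
RESIDUES_PER_TURN = 3.6  # pitch of the alpha-helix quoted in the text


def residues_from_minus1(position):
    """Number of residues separating a helical position from position -1,
    assuming the numbering runs -1, +1, +2, ... (no position 0)."""
    return position if position > 0 else 0


def turns_from_minus1(position):
    """Separation from position -1 expressed in helical turns."""
    return residues_from_minus1(position) / RESIDUES_PER_TURN


for pos in (-1, 3, 6):
    turns = turns_from_minus1(pos)
    angle = (turns * 360.0) % 360.0
    print(f"position {pos:+d}: {turns:.2f} turns, {angle:.0f} degrees around the helix axis")
```

The three positions therefore point in different directions around the helix axis, which is the geometric basis of the distance effect; the calculation does not by itself show which position lies closest to the DNA.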

As a consequence of these distance effects one might say that the code is not really
"alphabetic" (always identical amino acid:base contact) but rather "syllabic" (use of a
small repertoire of amino acid:base contacts). An alphabetic code would involve only four
rules, but syllabicity adds an additional level of complexity, since systematic combinations
of rules comprise the code. Nevertheless, the recognition of each triplet is still best
described by a code of syllables, rather than a catalogue of "logograms" (idiosyncratic
amino acid:base contact depending on triplet).

Conclusions. The "syllabic" code of interactions with DNA is made possible by the 
versatile framework of the zinc finger: this allows an adaptability at the interface with 
DNA by slight changes of orientation, which in turn maintains a stoichiometry of one 
coplanar amino acid per base-pair in many different complexes. Given this mode of 
interaction between amino acids and bases it is to be expected that recognition of G and 
A by Arg and Asn/Gln respectively are important features of the code; but remarkably 
other interactions can be more discriminatory than was anticipated (Seeman et al., 1976).
Conversely, it is clear that degeneracy can be programmed in the zinc fingers in varying 
degrees allowing for intricate interactions with different regulatory DNA sequences 
(Harrison & Travers, 1990; Christy & Nathans, 1989). One can see how this principle 
makes possible the regulation of differential gene expression by a limited set of 
transcription factors. 

As already noted above, the versatility of the finger motif will likely allow other modes
of binding to DNA. Similarly, one must take into account the malleability of nucleic acids
such as is observed in Fairall et al., where a deformation of the double helix at a flexible
base step allows a direct contact from Ser at position +2 of finger 1 to a T at the 3'

the first intron of the c-ABL gene and in the breakpoint cluster region of the BCR gene
(Shtivelman et al., 1985 Nature 315, 550-554), and give rise to a p210 BCR-ABL gene product
(Konopka et al., 1984 Cell 37, 1035-1042). Alternatively, in acute lymphoblastic
leukaemia (ALL), the breakpoints usually occur in the first introns of both BCR and c-ABL
(Hermans et al., 1987 Cell 51, 33-40), and result in a p190 BCR-ABL gene product (Figure
6) (Kurzrock et al., 1987 Nature 325, 631-635).

Figure 6 shows the nucleotide sequences (Seq ID Nos. 9-11) of the fusion point between
BCR and ABL sequences in p190 cDNA, and of the corresponding exon boundaries in the
BCR and c-ABL genes. Exon sequences are written in capital letters while introns are
given in lowercase. Line 1 shows p190 BCR-ABL cDNA; line 2 the BCR genomic sequence
at the junction of exon 1 and intron 1; and line 3 the ABL genomic sequence at the junction of
intron 1 and exon 2 (Hermans et al., 1987). The 9bp sequence in the p190 BCR-ABL cDNA used
as a target is underlined, as are the homologous sequences in genomic BCR and c-ABL.

Facsimiles of these rearranged genes act as dominant transforming oncogenes in cell
culture (Daley et al., 1988) and transgenic mice (Heisterkamp et al., 1990 Nature 344,
251-253). Like their genomic counterparts, the cDNAs bear a unique nucleotide sequence
at the fusion point of the BCR and c-ABL genes, which can be recognised at the DNA
level by a site-specific DNA-binding protein. The present inventors have designed such
a protein to recognise the unique fusion site in the p190 BCR-ABL cDNA. This fusion is
obviously distinct from the breakpoints in the spontaneous genomic translocations, which
are thought to be variable among patients. Although the design of such peptides has
implications for cancer research, the primary aim here is to prove the principle of protein
design, and to assess the feasibility of in vivo binding to chromosomal DNA in available
model systems.

A nine base-pair target sequence (GCA, GAA, GCC) for a three zinc finger peptide was
chosen which spanned the fusion point of the p190 BCR-ABL cDNA (Hermans et al., 1987).
The three triplets forming this binding site were each used to screen a zinc finger phage
library over three rounds as described above in example 1. The selected fingers were then
analysed by binding site signatures to reveal their preferred triplet, and mutations to
improve specificity were made to the finger selected for binding to GCA. A phage display
mini-library of putative BCR-ABL-binding three-finger proteins was cloned in fd phage,
comprising six possible combinations of the six selected or designed fingers (1A, 1B; 2A;
3A, 3B and 3C) linked in the appropriate order. These fingers are illustrated in Figure
7 (Seq ID Nos. 12-17). In Figure 7 regions of secondary structure are underlined below
the list, while residue positions are given above, relative to the first position of the α-helix
(position 1). Zinc finger phages were selected from a library of 2.6 x 10^6 variants, using
three DNA binding sites each containing one of the triplets GCC, GAA or GCA. Binding
site signatures (example 2) indicate that fingers 1A and 1B specify the triplet GCC, finger
2A specifies GAA, while the fingers selected using the triplet GCA all prefer binding to
GCT. Amongst the latter is finger 3A, the specificity of which we believed, on the basis
of recognition rules, could be changed by a point mutation. Finger 3B, based on the
selected finger 3A, but in which Gln at helical position +2 was altered to Ala, should be
specific for GCA. Finger 3C is an alternative version of finger 3A, in which the
recognition of C is mediated by Asp+3 rather than by Thr+3.
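
The composition of the mini-library follows directly from the finger sets named above: two candidates for the first finger, one for the second and three for the third. A minimal Python sketch of that enumeration is given below; the variable and function names are illustrative only.

```python
from itertools import product

# Finger candidates as named in the text, grouped by position in the peptide.
FINGER_1 = ["1A", "1B"]        # specify the triplet GCC
FINGER_2 = ["2A"]              # specifies the triplet GAA
FINGER_3 = ["3A", "3B", "3C"]  # selected or designed against the triplet GCA


def mini_library():
    """Enumerate every three-finger combination, fingers linked in order 1-2-3."""
    return ["-".join(combo) for combo in product(FINGER_1, FINGER_2, FINGER_3)]


for peptide in mini_library():
    print(peptide)
```

The enumeration gives 2 x 1 x 3 = 6 peptides, matching the six combinations stated above.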

The mini library was screened once with an oligonucleotide containing the 9 base-pair 
BCR-ABL target sequence to select for tight binding clones over weak binders and 
background vector phage. Because the library was small, the inventors did not include 
competitor DNA sequences for homologous regions of the genomic BCR and c-ABL genes 
but instead checked the selected clones for their ability to discriminate. It was found that 
although all the selected clones were able to bind the BCR-ABL target sequence and to
discriminate between this and the genomic BCR sequence, only a subset could discriminate
against the c-ABL sequence which, at the junction between intron 1 and exon 2, has an 8/9
base-pair homology to the BCR-ABL target sequence (Hermans et al., 1987). Sequencing
of the discriminating clones revealed two types of selected peptide, one with the
composition 1A-2A-3B and the other with 1B-2A-3B. Thus both peptides carried the third
finger (3B) which was specifically designed against the triplet GCA, but peptide 1A-2A-3B
was able to bind to the BCR-ABL target sequence with higher affinity than was peptide
1B-2A-3B.
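
The degree of homology between the target and the two genomic sequences can be checked by counting mismatches. The sketch below uses the three 10-base sequences quoted in this example for the reporter constructs; aligning them base for base and dropping the first base to leave the 9 base-pair site is an assumption made here for illustration, as are the names used.

```python
# 10-base sequences quoted in the text for the reporter plasmid inserts.
TEN_MERS = {
    "BCR-ABL": "CGCAGAAGCC",  # p190 BCR-ABL target site
    "c-ABL": "TCCAGAAGCC",    # c-ABL second exon-intron junction
    "BCR": "CGCAGGTGAG",      # BCR first exon-intron junction
}

# Assumed alignment: the 9 bp site is the last nine bases of each 10-mer.
NINE_MERS = {name: seq[1:] for name, seq in TEN_MERS.items()}


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


target = NINE_MERS["BCR-ABL"]
for name in ("c-ABL", "BCR"):
    print(name, mismatches(target, NINE_MERS[name]))
```

With this alignment the c-ABL site differs from the target at one of nine positions, in line with the 8/9 homology stated above, while the BCR site differs at four.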

The peptide 1A-2A-3B, henceforth referred to as the anti-BCR-ABL peptide, was used in
further experiments. The anti-BCR-ABL peptide has an apparent equilibrium dissociation
constant (Kd) of 6.2 +/- 0.4 x 10^-7 M for the p190 BCR-ABL cDNA sequence in vitro, and
discriminates against the similar sequences found in genomic BCR and c-ABL DNA by
factors greater than an order of magnitude (Figure 8). Referring to Figure 8 (which
illustrates discrimination in the binding of the anti-BCR-ABL peptide to its p190 BCR-ABL
target site and to like regions of genomic BCR and c-ABL), the graph shows binding
(measured as an absorbance) at various [DNA]. Binding reactions and complex detection by
enzyme immunoassay were performed as described previously, and a full curve analysis
was used in calculations of the Kd (Choo & Klug 1993). The DNA used were
oligonucleotides spanning 9bp either side of the fusion point in the cDNA or the exon
boundaries. The anti-BCR-ABL peptide binds to its intended target site with a Kd = 6.2 +/-
0.4 x 10^-7 M, and is able to discriminate against genomic BCR and c-ABL sequences,
though the latter differs by only one base pair in the bound 9bp region.

The measured dissociation constant is higher than that of three-finger peptides from
naturally occurring proteins such as Sp1 (Kadonaga et al., 1987 Cell 51, 1079-1090) or
Zif268 (Christy et al., 1988), which have Kds in the range of 10^-9 M, but rather is
comparable to that of the two fingers from the tramtrack (ttk) protein (Fairall et al.,
1992). However, the affinity of the anti-BCR-ABL peptide could be refined, if desired,
by site-directed mutations or by "affinity maturation" of a phage display library (Hawkins
et al., 1992 J. Mol. Biol. 226, 889-896).

Having established DNA discrimination in vitro, the inventors wished to test whether the
anti-BCR-ABL peptide was capable of site-specific DNA-binding in vivo. The peptide was
fused to the VP16 activation domain from herpes simplex virus (Fields 1993 Methods 5,
116-124) and used in transient transfection assays (Figure 9) to drive production of a CAT
(chloramphenicol acetyl transferase) reporter gene from a binding site upstream of the
TATA box (Gorman et al., Mol. Cell. Biol. 2, 1044-1051). In detail, the experiment was
performed thus: reporter plasmids pMCAT6BA, pMCAT6A, and pMCAT6B were
constructed by inserting 6 copies of the p190 BCR-ABL target site (CGCAGAAGCC), the
c-ABL second exon-intron junction sequence (TCCAGAAGCC), or the BCR first
exon-intron junction sequence (CGCAGGTGAG) respectively, into pMCAT3 (Luscher et
al., 1989 Genes Dev. 3, 1507-1517). The anti-BCR-ABL/VP16 expression vector was
generated by inserting the in-frame fusion between the activation domain of herpes simplex
virus VP16 (Fields 1993) and the Zn finger peptide in the pEF-BOS vector (Mizushima
& Shigezaku 1990 Nucl. Acids Res. 18, 5322). C3H10T1/2 cells were transiently
co-transfected with 10 µg of reporter plasmid and 10 µg of expression vector. RSVL (de
Wet et al., 1987 Mol. Cell Biol. 7, 725-737), which contains the Rous sarcoma virus long
terminal repeat linked to luciferase, was used as an internal control to normalise for
differences in transfection efficiency. Cells were transfected by the calcium phosphate
precipitation method and CAT assays performed as described (Sanchez-Garcia et al., 1993
EMBO J. 12, 4243-4250). Plasmid pG5EC, which has five consensus 17-mer
GAL4-binding sites upstream from the minimal promoter of the adenovirus E1b TATA
box, and pM1VP16 vector, which encodes an in-frame fusion between the DNA-binding
domain of GAL4 and the activation domain of herpes simplex virus VP16, were used as
a positive control (Sadowski et al., 1992 Gene 118, 137-141). The results are shown in
Figure 9.

Referring to Figure 9, C3H10T1/2 cells were transiently cotransfected with a CAT 
reporter plasmid and an anti-BCR-ABL/VP16 expression vector (pZNIA). The top panel 
of the figure shows the results of thin layer chromatography of samples from different 
transfections, in which the fold induction of CAT activity relative to a sample where
reporter alone was transfected (panel 1) is plotted on a histogram below.

A specific (thirty-fold) increase in CAT activity was observed in cells cotransfected with
reporter plasmid bearing copies of the p190 BCR-ABL cDNA target site, compared to a barely
detectable increase in cells cotransfected with reporter plasmid bearing copies of either the
BCR or c-ABL semihomologous sequences, indicating in vivo binding. The particular
constructs used in different transfections are noted below the histogram.

The selective stimulation of transcription indicates convincingly that highly site-specific
DNA-binding can occur in vivo. However, while transient transfections assay binding to
plasmid DNA, the true target site for this and most other DNA-binding proteins is
genomic DNA. This might well present significant problems, not least since this DNA
is physically separated from the cytosol by the nuclear membrane, but also since it may
be packaged within chromatin.

Figure 10 (immunofluorescence of Ba/F3 + p190 and Ba/F3 + p210 cells transiently
transfected with the anti-BCR-ABL expression vector and stained with the 9E10 antibody).
The image shows expression and nuclear localisation of the anti-BCR-ABL peptide (panels
C and D). In addition, transfected Ba/F3 + p190 cells show chromatin condensation and
nuclear fragmentation into small apoptotic bodies (panels B and C), but not either
untransfected Ba/F3 + p190 cells (panel A) or transfected Ba/F3 + p210 cells (panel D).

The efficiency of transient transfection, measured as the proportion of immunofluorescent
cells in the population, was 15-20%. When IL-3 is withdrawn from tissue culture, a
corresponding proportion of Ba/F3 + p190 cells are found to have reverted to factor
dependence and die, while Ba/F3 + p210 cells are unaffected. The experimental details
were as follows: cell lines Ba/F3, Ba/F3 + p190 and Ba/F3 + p210 were maintained in
Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine
serum. In the case of the Ba/F3 cell line, 10% WEHI-3B-conditioned medium was included
as a source of IL-3. After the transfection with the anti-BCR-ABL expression vector, cells
were washed twice in serum-free medium and cultured in DMEM medium with
10% fetal bovine serum without WEHI-3B-conditioned medium. Percentage viability was
determined by trypan blue exclusion. Data are expressed as means of triplicate cultures.
The results are shown in graphical form in Figure 11.

Immunofluorescence microscopy of transfected Ba/F3 + p190 cells in the absence of IL-3
shows chromatin condensation and nuclear fragmentation into small apoptotic bodies,
while the nuclei of Ba/F3 + p210 cells remain intact (Figure 10). Northern blots of
cytoplasmic RNA from Ba/F3 + p190 cells transiently transfected with the anti-BCR-ABL
peptide revealed reduced levels of p190 BCR-ABL mRNA relative to untransfected cells. By
contrast, similarly transfected Ba/F3 + p210 cells showed no decrease in the levels of
p210 BCR-ABL mRNA (Figure 12). The blots were performed as follows: 10 µg of
cytoplasmic RNA, from the cells indicated, was glyoxylated and fractionated in
agarose gels in 10 mM NaPO4 buffer, pH 7.0. After electrophoresis the gel was blotted
onto Hybond-N (Amersham), UV-cross linked and hybridised to a
probe. Autoradiography was for 14 h at -70°C. Loading was monitored by reprobing the
filters with a mouse β-actin cDNA.

Referring to Figure 12 (Northern filter hybridisation analysis of Ba/F3 + p190 and
Ba/F3 + p210 cell lines transfected with the anti-BCR-ABL expression vector), lane 1 is
from untransfected Ba/F3 + p190 cell line; lanes 2 and 3 are from Ba/F3 + p190 cell line
transfected with the anti-BCR-ABL expression vector; lane 4 is from untransfected
Ba/F3 + p210 cell line; lanes 5 and 6 are from Ba/F3 + p210 cell line transfected with the
anti-BCR-ABL expression vector. When transfected with the anti-BCR-ABL expression
vector, a specific downregulation of p190 BCR-ABL mRNA is seen in Ba/F3 + p190 cells, while
expression of p210 BCR-ABL is unaffected in Ba/F3 + p210 cells.

In summary, the inventors have demonstrated that a DNA-binding protein designed to 
recognise a specific DNA sequence in vitro, is active in vivo where, directed to the 
nucleus by an appended localisation signal, it can bind its target sequence in chromosomal 
DNA. This is found on otherwise actively transcribing DNA, so presumably binding of 
the peptide blocks the path of the polymerase, causing stalling or abortion. The use of a 
specific polypeptide in this case to target intragenic sequences is reminiscent of antisense 
oligonucleotide- or ribozyme- based approaches to inhibiting the expression of selected 
genes (Stein & Cheng 1993 Science 261, 1004-1012). Like antisense oligonucleotides, 
zinc finger DNA-binding proteins can be tailored against genes altered by chromosomal 
translocations, or point mutations, as well as to regulatory sequences within genes. Also, 
like oligonucleotides which can be designed to repress transcription by triple helix
formation in homopurine-homopyrimidine promoters (Cooney et al., 1988 Science 245,
725-730), DNA-binding proteins can bind to various unique regions outside genes, but in
contrast they can direct gene expression by either up- or down-regulating the initiation of
transcription when fused to activation (Seipel et al., 1992 EMBO J. 11, 4961-4968) or
repression domains (Herschbach et al., 1994 Nature 370, 309-311). In any case, by
acting directly on any DNA, and by allowing fusion to a variety of protein effectors, 
tailored site-specific DNA-binding proteins have the potential to control gene expression, 
and indeed to manipulate the genetic material itself, in medicine and research. 

Example 4 

The phage display zinc finger library described in the preceding examples could be
considered sub-optimal in a number of ways:

i) the library was much smaller than the theoretical maximum size; 

ii) the flanking fingers both recognised GCG triplets (in certain cases creating nearly 
symmetrical binding sites for the three zinc fingers, which enables the peptide to bind to 
the 'bottom' strand of DNA, thus evading the register of interactions we wished to set); 

iii) Asp+2 of finger three ("Asp+ + 2") was dominant over the interactions of finger two 
(position +6) with the 5' base of the middle triplet; 

iv) not all amino acids were represented in the randomised positions. 

In order to overcome these problems a new three-finger library was created in which: 

a) the middle finger is fully randomised in only four positions (-1, +2, +3 and +6) so 
that the library size is smaller and all codons are represented. The library was cloned in
the pCANTAB5E phagemid vector from Pharmacia, which allows higher transformation
frequencies than the phage.

b) the first and third fingers recognise the triplets GAC and GCA, respectively, making
for a highly asymmetric binding site. Recognition of the 3' A in the latter triplet by finger
three is mediated by Gln-1/Ala+2, the significance of which is that the short Ala+2
should not make contacts to the DNA (in particular with the 5' base of the middle triplet), 
thus alleviating the problem noted at (iii) above. 
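
The reduction in library size that follows from randomising only four positions can be estimated at the protein level. The sketch below simply counts the amino acid combinations possible at the four randomised positions named above, assuming all 20 amino acids are allowed at each; the number of DNA variants depends on the codon scheme, which is not stated here, so it is not computed. The names used are illustrative.

```python
AMINO_ACIDS = 20  # assumes every amino acid is allowed at each randomised position

RANDOMISED_POSITIONS = (-1, +2, +3, +6)  # positions of the middle finger named in the text


def protein_variants(n_positions):
    """Number of distinct amino acid combinations over n fully randomised positions."""
    return AMINO_ACIDS ** n_positions


print(protein_variants(len(RANDOMISED_POSITIONS)))  # 160000
```

A theoretical diversity of this order is small enough to be covered by a library of the size quoted for the earlier examples.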

Example 5 

The human ras gene is susceptible to a number of different mutations, which can convert 
it into an oncogene. A ras oncogene is found in a large number of human cancers. One 
particular mutation is known as the G12V mutation (i.e. the polypeptide encoded by the 
mutant gene contains a substitution from glycine to valine). Because ras oncogenes are 






WO 96/06166 

49 PCT/GB95/01949 

so common in human cancers, they are [...] therapeutic [...].

A three finger protein [...]. The protein was produced using rational design based on
[...]. The finger recognising GCC and the [...] pCANTAB5E and expressed [...].

[...] the GCC triplet should be recognised by +3 Asp, or Glu, or Ser, or Thr (see Table 2
above). Thus a three-finger peptide gene was [...] oligonucleotides which were [...]
ligated according to standard procedures and [...] a 2% agarose gel. [...] position +3
[...] inclusion of each of the above amino acids (D, E, S & T) and also certain other
residues which were not predicted to be desirable (e.g. Asn). The synthetic
oligonucleotides were designed to have SfiI and NotI overhangs when annealed. [...] as
previously described and [...] selection [...] of the library which bound poorly.

Following selection, a number of separate clones were isolated and phage produced from
these were screened by ELISA for binding to the G12V ras sequence and discrimination
against the wild-type ras sequence. A number of clones were able to do this, and
sequencing of phage DNA later revealed these to fall into two categories, one [...]
undesirable mutations.




The appearance of Asn at position +3 is unexpected and most probably due to the fact that 
proteins with a cytosine-specific residue at position +3 bind to some E. coli DNA 
sequence so tightly that they are lethal. Thus phage display selection is not always 
guaranteed to produce the tightest-binding clone, since passage through bacteria is essential 
to the technique, and the selected proteins may be those which do not bind to the genome 
of this host if such binding is deleterious. 

Kd measurements show that the clone with Asn+3 nevertheless binds the mutant G12V
sequence with a Kd in the nM range and discriminates against the wild-type ras sequence.
However it was predicted that Asn+3 should specify an adenine residue at the middle
position, whereas the polypeptide we wished to make should specify a cytosine for
optimal binding.

Thus we assembled a three-finger peptide with a Ser at position +3 of Finger 1 (as shown
in Figure 15), again using synthetic oligos. This time the gene was ligated to
pCANTAB5E phagemid. Transformants were isolated in the E. coli ABLE-C strain (from
Stratagene) and grown at 30°C, which strain under these conditions reduces the copy
number of plasmids so as to make their toxic products less abundant in the cells.

The amino acid sequence (Seq ID No. 18) of the fingers is shown in Figure 15. The
numbers refer to the α-helical amino acid residues. The fingers (F1, F2 & F3) bind to
the G12V mutant nucleotide sequence:

5' GAC GGC GCC 3'
   F3  F2  F1

The bold A shows the single point mutation by which the G12V sequence differs from the
wild type sequence.
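
The register between fingers and triplets shown above (finger 1 on the 3'-most triplet, finger 3 on the 5'-most) can be written out as a short helper. This is an illustrative sketch only; the function name is introduced here, and it assumes a 9 base-pair site written 5' to 3'.

```python
def assign_fingers(site_5_to_3):
    """Split a 9 bp site (written 5' to 3') into triplets and pair each with the
    finger that reads it: F3 reads the 5' triplet and F1 the 3' triplet."""
    site = site_5_to_3.replace(" ", "").upper()
    if len(site) != 9:
        raise ValueError("expected a 9 bp binding site")
    triplets = [site[i:i + 3] for i in range(0, 9, 3)]
    return dict(zip(["F3", "F2", "F1"], triplets))


print(assign_fingers("GAC GGC GCC"))  # the G12V site given in the text
print(assign_fingers("GCAGAAGCC"))    # the BCR-ABL target of example 3
```

The same register applies to the BCR-ABL peptide, where fingers 1A and 1B specify GCC, the 3' triplet of the target.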

Assay of the protein in eukaryotes (e.g. to drive CAT reporter production) requires the
use of a weak promoter. When expression of the anti-RAS (G12V) protein is strong, the
peptide presumably binds to the wild-type ras allele (which is required) leading to cell
death. For this reason, a regulatable promoter (e.g. for tetracycline) will be used to
deliver the protein in therapeutic applications, so that the intracellular concentration of the
protein exceeds the Kd for the G12V point mutated gene but not the Kd for the wild-type
allele. Since the G12V mutation is a naturally occurring genomic mutation (not only a
cDNA mutation as was the p190 BCR-ABL), human cell lines and other animal models can
be used in research.

In addition to repressing the expression of the gene, the protein can be used to diagnose
the precise point mutation present in the genomic DNA, or more likely in PCR amplified
genomic DNA, without sequencing. It should therefore be possible, without further
inventive activity, to design diagnostic kits for detecting (e.g. point) mutations in DNA.
ELISA-based methods should prove particularly suitable.

It is hoped to fuse the zinc finger binding polypeptide to an scFv fragment which binds
to the human transferrin receptor, which should enhance delivery to and uptake by human
cells. The transferrin receptor is thought particularly useful but, in theory, any receptor
molecule (preferably of high affinity) expressed on the surface of a human target cell could
act as a suitable ligand, either for a specific immunoglobulin or fragment, or for the
receptor's natural ligand fused or coupled with the zinc finger polypeptide.

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Medical Research Council 

(B) STREET: 20 Park Crescent 

(C) CITY: London 

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): W1N 4AL

(ii) TITLE OF INVENTION: Improvements in or Relating to Binding 
Proteins for Recognition of DNA 

(iii) NUMBER OF SEQUENCES: 18

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
:tcctgcagt TGGACCTGTG CCATGGCCGG CTGGGCCGCA TAGAATG5AA CAACTAAAGC 60 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 92 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Glu Glu Arg Pro Tyr Ala Cys Pro Val Glu Ser Cys Asp Arg
1               5                   10                  15

Arg Phe Ser Arg Ser Asp Glu Leu Thr Arg His Ile Arg Ile His Thr
            20                  25                  30

Gly Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Xaa
        35                  40                  45

Xaa Xaa Xaa Leu Xaa Xaa His Xaa Arg Thr His Thr Gly Glu Lys Pro
    50                  55                  60

Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Glu Arg
65                  70                  75                  80

Lys Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                85                  90

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TATGACTTGG ATGGGAGACC GCCTGG 26

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AATTCCAGGC GGTCTCCCAT CCAAGTCA 28

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TATATAGCGT GGGCGTATAT A 21

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCGTATATAC GCCCACGCTA TATA



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TATATAGCGN NNGCGTATAT A



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GCGTATATAC GCNNNCGCTA TATA 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TTCCATGGAG ACGCAGAAGC CCTTCAGCGG CCA 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
TTCCATGGAG ACGCAGGTGA GTTCCTCACG CCA 33

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCCTTTCTC TTCCAGAAGC CCTTCAGCGG CCA 33 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe
1 5 10 15

Ser Asp Arg Ser Ser Leu Thr Arg His Thr Arg His Thr Gly Glu Lys

20 25 30

Pro 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe
1 5 10 15

Ser Glu Arg Gly Thr Leu Ala Arg His Glu Lys His Thr Gly Glu Lys

20 25 30 

Pro 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Gly Gly Asn Leu
1 5 10 15

Val Arg His Leu Arg His Thr Gly Glu Lys Pro 

20 25 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Ala Gln Thr Leu
1 5 10 15

Gln Arg His Leu Lys His Thr Gly Glu Lys

20 25 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Ala Ala Thr Leu
1 5 10 15

Gln Arg His Leu Lys His Thr Gly Glu Lys

20 25 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Ala Gln Asp Leu

1 5 10 15

Gln Arg His Leu Lys His Thr Gly Glu Lys

20 25

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe

1 5 10 15
Ser Asp Arg Ser Ser Leu Thr Arg His Thr Arg Thr His Thr Gly Glu

20 25 30
Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser

His Leu Thr Arg His Thr Arg Thr His Thr Gly Glu Lys Pro Phe Gln

Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser Asn Leu Thr Arg

65 70 75 80

His Thr Arg Thr His Thr Gly Glu Lys 

85 
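
The stated LENGTH of each peptide entry can be cross-checked against the residues actually listed. The Python sketch below converts the three-letter codes to one-letter codes and counts them for SEQ ID NOs: 14 and 18; the function name is illustrative, and the residues are written with the standard three-letter codes (Gln, Ile, Cys, Thr), which is an assumption about how the listing should be read.

```python
# Illustrative cross-check; names are assumptions, not part of the listing.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",  # unspecified (randomised) residue
}

SEQ_ID_14 = (
    "Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Gly Gly Asn Leu "
    "Val Arg His Leu Arg His Thr Gly Glu Lys Pro"
)

SEQ_ID_18 = (
    "Met Ala Glu Glu Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe "
    "Ser Asp Arg Ser Ser Leu Thr Arg His Thr Arg Thr His Thr Gly Glu "
    "Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser "
    "His Leu Thr Arg His Thr Arg Thr His Thr Gly Glu Lys Pro Phe Gln "
    "Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser Asn Leu Thr Arg "
    "His Thr Arg Thr His Thr Gly Glu Lys"
)


def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter sequence to one-letter codes.
    Raises KeyError on any token that is not a recognised residue code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split())


print(len(to_one_letter(SEQ_ID_14)), to_one_letter(SEQ_ID_14))
print(len(to_one_letter(SEQ_ID_18)))
```

A KeyError from `to_one_letter` is a quick way to spot a residue code that was damaged in transcription.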
Claims



1. A library of DNA sequences, each sequence encoding at least one zinc finger binding
motif for display on a viral particle, the sequences coding for zinc finger binding motifs
having random allocation of amino acids at positions -1, +2, +3, +6 and at least at one
of positions +1, +5 and +8.

2. A library of DNA sequences, each sequence encoding the zinc finger binding motif
of at least a middle finger of a zinc finger binding polypeptide for display on a viral
particle, the sequence coding for the binding motif having random allocation of amino
acids at positions -1, +2, +3 and +6.

3. A library of sequences according to claim 2, wherein the sequences coding for the
binding motif have further random allocation of amino acids at one or more of positions
+1, +5 and +8.

4. A library of sequences according to any one of claims 1, 2 or 3, wherein the
sequences coding for the binding motif have random allocation of amino acids at positions
+1, +5 and +8.

5. A library of sequences according to any one of the preceding claims, wherein the 
sequence encoded comprises a zinc finger polypeptide comprising a plurality of zinc 
fingers, adjacent fingers being joined by an intervening linker peptide. 

6. A library of sequences according to any one of the preceding claims, wherein the
sequence encoded comprises a zinc finger of the Zif 268 polypeptide.

7. A library of sequences according to any one of the preceding claims, wherein the 
sequence encoded comprises a zinc finger having random allocation of amino acids, 
positioned between two or more zinc fingers having a defined amino acid sequence. 
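
The randomisation schemes of claims 1 to 4 differ in how many helix positions are varied, which sets an upper bound on the number of distinct motifs a library can contain. The Python sketch below works out that bound under the assumption that each randomised position may hold any of the 20 standard amino acids; the claims speak only of "random allocation of amino acids", so the alphabet size, like the names used here, is an assumption.

```python
# Illustrative arithmetic; the 20-residue alphabet is an assumption, not stated in the claims.
CORE_POSITIONS = (-1, 2, 3, 6)    # randomised in claims 1 and 2
EXTRA_POSITIONS = (1, 5, 8)       # additionally randomised in claims 3 and 4


def max_motif_variants(positions, alphabet_size=20):
    """Upper bound on distinct amino acid motifs for the given randomised positions."""
    return alphabet_size ** len(positions)


print(max_motif_variants(CORE_POSITIONS))                        # four positions
print(max_motif_variants(CORE_POSITIONS + EXTRA_POSITIONS[:1]))  # five positions
print(max_motif_variants(CORE_POSITIONS + EXTRA_POSITIONS))      # seven positions
```

Under this assumption the bound rises from 160,000 variants for four positions to 1,280,000,000 for all seven. The number of clones actually obtained in a library may be far smaller.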



8. A library of sequences according to any one of the preceding claims, in a form
suitable for cloning as a fusion with the minor coat protein of bacteriophage fd.



9. A method of designing a zinc finger polypeptide for binding to a particular target DNA
sequence, comprising screening each of a plurality of zinc finger binding motifs against
at least an effective portion of the target DNA sequence, and selecting those motifs which
bind to the target DNA sequence.

10. A method according to claim 9, wherein two or more rounds of [remainder illegible in source]

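11. A method of designing a zinc finger [polypeptide] for binding to a [particular target]
DNA sequence, comprising comparing [the binding of each of a plurality of zinc finger]
binding motifs to one or more DNA triplets, and [selecting those] motifs [exhibiting]
preferable binding characteristics.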

12. A method according to claim 11, further comprising [illegible in source]
according to claim 9 or 10.

13. A method of designing a zinc finger polypeptide for binding to a target DNA
sequence, comprising combining in a single zinc finger polypeptide a plurality of zinc
finger binding motifs, each of which has been screened by the method of claim 9 or 10
and/or selected by the method of claim 11 or 12.

14. A method according to claim 13, wherein the intervening linker [between]
adjacent zinc finger binding motifs is that present in a naturally occurring zinc finger
binding polypeptide, or is an artificial peptide sequence, or is an artificial non-amino acid
linker.

15. A zinc finger polypeptide for binding to a target DNA sequence, designed according
to the method of any one of claims 9 to 14.

16. A DNA library consisting of 64 sequences, each sequence comprising a different one
of the 64 possible permutations of three DNA bases in a form suitable for use in the
selection method of claim 11 or 12.

17. A library according to claim 16, wherein the sequences are associated, or are capable 
of being associated, with separation means. 

18. A library according to claim 17, wherein the separation means is selected from one
of the following: microtitre plate; magnetic or non-magnetic beads or particles capable of
sedimentation; and an affinity chromatography column.

19. A library according to any one of claims 16, 17 or 18, wherein the sequences are
biotinylated.

20. A library according to any one of claims 16 to 19, wherein the sequences are
contained within 12 mini-libraries.
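
The 64-member library of claim 16 and the 12 mini-libraries of claim 20 can be enumerated directly. The Python sketch below places each triplet in the flanking context of SEQ ID NO: 7 (TATATAGCG NNN GCGTATATA) and groups the triplets into 12 pools. The grouping rule used here, one pool per fixed base at one of the three positions with the other two positions free, is an assumption chosen because 3 positions x 4 bases gives 12; the claims themselves do not say how the mini-libraries are composed. All function names are illustrative.

```python
# Illustrative enumeration; the pooling rule is an ASSUMPTION, not stated in the claims.
from itertools import product
from typing import Dict, List, Tuple

BASES = "ACGT"
LEFT_FLANK = "TATATAGCG"    # bases before NNN in SEQ ID NO: 7
RIGHT_FLANK = "GCGTATATA"   # bases after NNN in SEQ ID NO: 7


def all_triplets() -> List[str]:
    """Return the 64 possible ordered combinations of three DNA bases."""
    return ["".join(bases) for bases in product(BASES, repeat=3)]


def binding_site(triplet: str) -> str:
    """Embed a triplet in the SEQ ID NO: 7 context in place of NNN."""
    return LEFT_FLANK + triplet + RIGHT_FLANK


def mini_libraries() -> Dict[Tuple[int, str], List[str]]:
    """ASSUMED scheme: one pool per (position, fixed base); each pool holds
    the 16 triplets that carry that base at that position."""
    return {
        (position, base): [t for t in all_triplets() if t[position] == base]
        for position in range(3)
        for base in BASES
    }


pools = mini_libraries()
print(len(all_triplets()), len(pools), len(pools[(0, "G")]))
print(binding_site("TGG"))
```

With the triplet TGG the template reproduces SEQ ID NO: 5. Under the assumed pooling rule every triplet appears in three of the twelve pools, one for each of its positions.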

21. A kit for making a zinc finger polypeptide for binding to a nucleic acid sequence of
interest, comprising: a library of DNA sequences encoding zinc finger binding motifs of
known binding characteristics in a form suitable for cloning into a vector; a vector
molecule suitable for accepting one or more sequences from the library; and instructions
for use.

22. A kit according to claim 21, wherein the vector is capable of directing the expression 
of the cloned sequences as a single zinc finger polypeptide. 

23. A kit according to claim 21 or 22, wherein the vector is capable of directing the
expression of the cloned sequences as a single zinc finger polypeptide displayed on the
surface of a viral particle.

24. A kit for making a zinc finger polypeptide for binding to a nucleic acid sequence of
interest, comprising: a library of DNA sequences, each encoding a zinc finger binding
motif in a form suitable for screening according to the method of claim 9 or 10, and/or
selecting according to the method of claim 11 or 12; and instructions for use.



25. A kit according to claim 24, wherein the library of DNA sequences is in accordance 
with any one of claims 1 to 8. 



26. A kit according to claim 24 or 25, further comprising a library according to any one
of claims 16 to 20.



27. A kit according to any one of claims 24, 25 or 26, further comprising appropriate buffer
solutions and/or reagents for detection of bound zinc finger motifs.

28. A kit according to any one of claims 24 to 27, further comprising a vector suitable
for accepting one or more sequences selected from the library of DNA sequences encoding
zinc finger binding motifs.

29. A method of altering the expression of a gene of interest in a target cell, comprising:
determining (if necessary) at least part of the DNA sequence of the structural region
and/or a regulatory region of the gene of interest; designing a zinc finger polypeptide to
bind to the DNA of determined sequence, and causing said zinc finger polypeptide to be
present in the target cell.

30. A method according to claim 29, wherein the zinc finger polypeptide is designed in
accordance with any one of claims 9 to 14.

31. A method according to claim 29 or 30, wherein the zinc finger polypeptide comprises
one or more further functional domains.

32. A method according to any one of claims 29, 30 or 31, wherein the zinc finger
polypeptide comprises a nuclear localisation signal so as to deliver the zinc finger
polypeptide to the nucleus of the target cell.

33. A method according to any one of claims 29 to 32, wherein the zinc finger
polypeptide comprises the nuclear localisation signal from the large T antigen of SV40.

34. A method according to any one of claims 29 to 33, wherein the zinc finger 
polypeptide is caused to be present in the target cell by delivery into the cell of DNA 
directing the intracellular expression of the polypeptide. 

35. A method of inhibiting cell division by altering the expression of a gene in 
accordance with the method of any one of claims 29 to 34, wherein the gene is one 
involved in regulating cell division. 

36. A method of treating cancer, comprising delivering to a patient, or causing to be 
present therein, a zinc finger polypeptide which inhibits the expression of a gene enabling 
the cancer cells to divide. 

37. A method of modifying a nucleic acid sequence of interest present in a sample 
mixture by binding thereto a zinc finger polypeptide, comprising contacting the sample 
mixture with a zinc finger polypeptide having affinity for at least a portion of the sequence 
of interest, so as to allow the zinc finger polypeptide to bind specifically to the sequence 
of interest. 

38. A method according to claim 37, wherein the zinc finger polypeptide is designed in
accordance with the method of any one of claims 9 to 14.

39. A method according to claim 37 or 38, further comprising the step of separating the 
zinc finger polypeptide (and nucleic acid sequences specifically bound thereto) from the 
rest of the sample. 

40. A method according to any one of claims 37, 38 or 39, wherein the zinc finger 
polypeptide is bound to a solid phase support. 

41. A method according to any one of claims 37 to 40, wherein the presence of the zinc
finger polypeptide bound to the sequence of interest is detected by the addition of one or
[text missing in source]

42. A method according to any one of claims 37 to 41, wherein the DNA [sequence of]
interest is present in an acrylamide or agarose gel [matrix, or is] present on the [surface] of
a membrane.

43. A zinc finger polypeptide capable of inhibiting the expression of a disease-associated
gene.

44. A zinc finger polypeptide according to claim 43, wherein the polypeptide is not
naturally occurring and is specifically designed to inhibit the expression of the disease-
associated gene.

45. A zinc finger polypeptide according to claim 43 or 44, designed by the method of any
one of claims 9 to 14.

46. A zinc finger polypeptide according to any one of claims [43 to 45], capable of
inhibiting the expression of an oncogene.

47. A zinc finger polypeptide according to any one of claims 43 to 46, capable of
inhibiting the expression of a BCR-ABL fusion oncogene.

48. A zinc finger polypeptide according to any one of claims 43 to 47, designed to bind
to the DNA sequence GCAGAAGCC.

49. A zinc finger polypeptide according to any one of claims 43 to 46, capable of
inhibiting the expression of a ras oncogene.

50. A zinc finger polypeptide according to claim 49, designed to bind to the DNA
sequence GACGGCGCC.
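
Claims 48 and 50 each name a nine-base target, which is the length three fingers would cover if each zinc finger binding motif is matched to one DNA triplet, as the triplet-based selection of claim 11 and the combination step of claim 13 suggest. A minimal Python sketch of that bookkeeping follows. The one-triplet-per-finger reading and the function name are assumptions, and the sketch says nothing about which finger binds which triplet or in which orientation.

```python
# Illustrative bookkeeping; one triplet per finger is an ASSUMPTION drawn from the claims.
from typing import List


def split_into_triplets(target: str) -> List[str]:
    """Split a target DNA sequence into consecutive, non-overlapping triplets."""
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of three")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


for target in ("GCAGAAGCC", "GACGGCGCC"):  # targets named in claims 48 and 50
    print(target, split_into_triplets(target))
```

Each triplet could then be looked up separately among motifs characterised by the methods of claims 9 to 12 before the chosen motifs are combined as in claim 13.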